Simple data processing

Last edited: 2016-01-31 Created: 2015-09-30

 

WUULONG

When you get data from the server, there are some simple ways to analyse it.

 

Linux CLI - download data

wget ftp://gpssensor.ddns.net:2121/data.log

 

Linux CLI - parse data

 

example data:

LASS/Test/MAPS |ver_format=1|fmt_opt=0|app=MAPS|ver_app=0.6.6|device_id=LASS-MAPS-LJ|tick=22411953|date=30/9/15|time=11:9:30|device=LinkItONE|values=1971.00,100.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,1008.33,27.80,99.90,0.00,0.00,0.00,0.00,0.00,0.00,0.00|gps=$GPGGA,110930.002,2502.4606,N,12136.8711,E,0,0,,133.6,M,15.3,M,,*42
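To see which field number each key occupies when a record is split on '|', one option is to print a single record with its fields numbered. This is only a small sketch: the function name number_fields is an illustrative assumption, and the sample variable simply holds the example record shown above.

```shell
# Print each '|'-separated field of a record on its own line, prefixed by its index.
# number_fields is a hypothetical helper name, not part of any tool.
number_fields() {
    awk -F '|' '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}

# The example record from this page (single quotes keep $GPGGA literal).
sample='LASS/Test/MAPS |ver_format=1|fmt_opt=0|app=MAPS|ver_app=0.6.6|device_id=LASS-MAPS-LJ|tick=22411953|date=30/9/15|time=11:9:30|device=LinkItONE|values=1971.00,100.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,1008.33,27.80,99.90,0.00,0.00,0.00,0.00,0.00,0.00,0.00|gps=$GPGGA,110930.002,2502.4606,N,12136.8711,E,0,0,,133.6,M,15.3,M,,*42'

printf '%s\n' "$sample" | number_fields
```

With a real log, `head -n 1 data.log | number_fields` would do the same for the first record.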

 

purpose: keep only the records with app=MAPS and device_id=LASS-MAPS-LJ, extract the 12th value from the values field, and plot it.

 

command:

cat data.log | grep 'app=MAPS' | grep 'device_id=LASS-MAPS-LJ' | awk -F '|' '{print $11}' | awk -F '=' '{print $2}' | awk -F ',' '{print $12}'

 

How it works:

  • filter app=MAPS
  • filter device_id=LASS-MAPS-LJ
  • get the 11th column by using separator '|' => values=1971.00,100.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,1008.33,27.80,99.90,0.00,0.00,0.00,0.00,0.00,0.00,0.00
  • get the second column by using separator '=' => 1971.00,100.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,1008.33,27.80,99.90,0.00,0.00,0.00,0.00,0.00,0.00,0.00
  • get the 12th column by using separator ',' => 27.80
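The steps above can also be collapsed into a single awk invocation. The following is a sketch rather than the original command: the function name extract_value is an assumption, and it compares the 4th and 6th fields exactly instead of matching the text anywhere in the line as grep does, so it only fits records laid out like the example.

```shell
# Print the 12th value of the "values=" field for one app/device pair.
# Usage: extract_value APP DEVICE_ID < data.log
# extract_value is a hypothetical helper name.
extract_value() {
    awk -F '|' -v app="app=$1" -v dev="device_id=$2" '
        $4 == app && $6 == dev {
            sub(/^values=/, "", $11)   # drop the key, keep the comma-separated list
            split($11, v, ",")         # v[1], v[2], ... hold the individual readings
            print v[12]
        }'
}

# The example record from this page (single quotes keep $GPGGA literal).
sample='LASS/Test/MAPS |ver_format=1|fmt_opt=0|app=MAPS|ver_app=0.6.6|device_id=LASS-MAPS-LJ|tick=22411953|date=30/9/15|time=11:9:30|device=LinkItONE|values=1971.00,100.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,1008.33,27.80,99.90,0.00,0.00,0.00,0.00,0.00,0.00,0.00|gps=$GPGGA,110930.002,2502.4606,N,12136.8711,E,0,0,,133.6,M,15.3,M,,*42'

printf '%s\n' "$sample" | extract_value MAPS LASS-MAPS-LJ
```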

 

You can copy all the results into a Google spreadsheet to get the chart.

Linux CLI - get all device ids

 

command:

cat data.log | awk -F '|' '{print $6}' | awk -F '=' '{print $2}' | sort | uniq -c

 

with version information

cat data.log | awk -F '|' '{print $5"="$6}' | awk -F '=' '{print $2" "$4}' | sort | uniq -c
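The commands above rely on device_id being the 6th field. If some records order their fields differently (an assumption; only one record layout is shown on this page), looking the field up by its key may be more robust. A minimal sketch, with a hypothetical function name and made-up two-field records used purely for illustration:

```shell
# Count records per device_id, locating the field by key rather than by position.
# count_device_ids is a hypothetical helper name.
count_device_ids() {
    awk -F '|' '
        {
            for (i = 1; i <= NF; i++)
                if (index($i, "device_id=") == 1) {            # field starts with the key
                    count[substr($i, length("device_id=") + 1)]++
                    break
                }
        }
        END { for (id in count) print count[id], id }' | sort -k2
}

# Made-up records (not real log data) with device_id in different positions.
printf '%s\n' \
    'T/1 |app=MAPS|device_id=dev-a' \
    'T/2 |device_id=dev-b|app=MAPS' \
    'T/1 |app=MAPS|device_id=dev-a' | count_device_ids
```

Each output line is a count followed by a device id, sorted by id, similar to what `sort | uniq -c` prints.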

 

Unique MQTT topics

cat data.log | awk '{print $1}' | sort | uniq

 

Webduino COPY device count

cat data.log | grep 'WEBDUINO_COPY' | awk -F '|' '{print $10}' | awk -F '=' '{print $2}' | sort | uniq

 

 

Get statistical information

This example uses R from the command line, so R needs to be installed.

 

Refer to this for details.

 

Create a file named stat.r with the following content and make sure it is executable.

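#! /usr/bin/env Rscript
# Read whitespace-separated numbers from standard input
d <- scan("stdin", quiet=TRUE)
# Print min, max, median, mean and standard deviation as one comma-separated line
cat(min(d), max(d), median(d), mean(d), sd(d), sep=",")
cat("\n")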

 

Command

  • previous cmd | ./stat.r

 

example result (min, max, median, mean, standard deviation)

  • 18.3,38.3,29.9,30.14143,1.957374
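Where R is not available, roughly the same summary can be produced with sort and awk. This is a swapped-in alternative, not the method used on this page: the function name summarise is an assumption, numbers are printed with awk's %g format (so rounding differs from R), and for a single input value it prints 0 for the standard deviation where R would give NA.

```shell
# Summarise a column of numbers from stdin as: min,max,median,mean,sd
# sd uses the n-1 denominator, like R's sd(). summarise is a hypothetical name.
summarise() {
    sort -n | awk '
        { x[NR] = $1; sum += $1 }      # input is sorted, so x[1] is min and x[NR] is max
        END {
            if (NR == 0) exit 1        # nothing to summarise
            mean = sum / NR
            for (i = 1; i <= NR; i++) ss += (x[i] - mean) ^ 2
            median = (NR % 2) ? x[(NR + 1) / 2] : (x[NR / 2] + x[NR / 2 + 1]) / 2
            sd = (NR > 1) ? sqrt(ss / (NR - 1)) : 0
            printf "%g,%g,%g,%g,%g\n", x[1], x[NR], median, mean, sd
        }'
}

# Small made-up input, just to show the call.
printf '%s\n' 3 1 2 | summarise
```

It is used in the same position as stat.r, at the end of a pipeline that prints one number per line.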

 

Plot sensor data against time

Get time data as column 1

cat data.log | grep 'app=MAPS' | grep 'device_id=LASS-MAPS-LJ' | awk -F '|' '{print $9}' | awk -F '=' '{print $2}' > col1.tmp

 

Get sensor data as column 2

cat data.log | grep 'app=MAPS' | grep 'device_id=LASS-MAPS-LJ' | awk -F '|' '{print $11}' | awk -F '=' '{print $2}' | awk -F ',' '{print $12}' > col2.tmp

 

Combine the two columns into CSV format

pr -mt -s, col1.tmp col2.tmp > cols.csv
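Because both temporary files come from the same filtered records, the time and the value can also be written in one pass, which avoids the intermediate files. A sketch under the same field positions as above ($9 for time, $11 for values); the function name time_value_csv is an assumption, and the field comparison is exact rather than a grep-style substring match.

```shell
# Emit "time,value" CSV rows for one app/device pair.
# Usage: time_value_csv APP DEVICE_ID < data.log > cols.csv
# time_value_csv is a hypothetical helper name.
time_value_csv() {
    awk -F '|' -v app="app=$1" -v dev="device_id=$2" '
        $4 == app && $6 == dev {
            sub(/^time=/, "", $9)      # keep only the time of day
            sub(/^values=/, "", $11)   # keep only the comma-separated readings
            split($11, v, ",")
            print $9 "," v[12]
        }'
}

# The example record from this page (single quotes keep $GPGGA literal).
sample='LASS/Test/MAPS |ver_format=1|fmt_opt=0|app=MAPS|ver_app=0.6.6|device_id=LASS-MAPS-LJ|tick=22411953|date=30/9/15|time=11:9:30|device=LinkItONE|values=1971.00,100.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,1008.33,27.80,99.90,0.00,0.00,0.00,0.00,0.00,0.00,0.00|gps=$GPGGA,110930.002,2502.4606,N,12136.8711,E,0,0,,133.6,M,15.3,M,,*42'

printf '%s\n' "$sample" | time_value_csv MAPS LASS-MAPS-LJ
```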

 

Import cols.csv into a Google spreadsheet and plot it.