online_duration.py
==================

View/visualize the amount of time people spend online.

Usage
-----

Run from the top-level directory using `python -m`:

```
> python -m bin.online_duration -h
usage: online_duration.py [-h] [--grouping {user,date,weekday,hour}]
                          [--input-format {csv,log,null}]
                          [--output-format {csv,json,plot}] [--from DATE_FROM]
                          [--to DATE_TO]
                          input [output]
```

This script additionally requires [matplotlib] to be installed.

The script analyzes the database produced by [track_status.py] and calculates
the total amount of time people spent online.
For example (assuming the database in "db.csv" was previously generated by
[track_status.py]):

```
> python -m bin.online_duration db.csv
89497105,John,Smith,john.smith,0:12:31
3698577,Jane,Smith,jane.smith,1:34:46
```

In the example above, "John Smith" and "Jane Smith" spent approx. 13 and 95
minutes online respectively.
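
If you want to post-process this output yourself, the following is a minimal sketch of reading it back in Python.
It assumes the column order is user ID, first name, last name, domain, duration (inferred from the sample above) and that durations are always printed as `H:MM:SS`.
The helper names `parse_duration` and `read_durations` are made up for this illustration and are not part of the script.

```python
import csv
import io
from datetime import timedelta

# Sample output copied from the example above.
SAMPLE = """\
89497105,John,Smith,john.smith,0:12:31
3698577,Jane,Smith,jane.smith,1:34:46
"""


def parse_duration(text):
    """Convert an "H:MM:SS" string into a timedelta (assumed format)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def read_durations(stream):
    """Map each user's domain (4th column) to the duration (5th column)."""
    return {row[3]: parse_duration(row[4]) for row in csv.reader(stream)}


durations = read_durations(io.StringIO(SAMPLE))
for domain, duration in durations.items():
    # Round to whole minutes, as in the prose above.
    print(domain, round(duration.total_seconds() / 60), "min")
```

With the sample data, this prints 13 minutes for `john.smith` and 95 minutes for `jane.smith`.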

The output format is CSV (comma-separated values) by default.
You can also get a JSON document:

```
> python -m bin.online_duration --output-format json db.csv
[
   {
      "uid": 89497105,
      "first_name": "John",
      "last_name": "Smith",
      "domain": "john.smith",
      "duration": "0:12:31"
   },
   {
      "uid": 3698577,
      "first_name": "Jane",
      "last_name": "Smith",
      "domain": "jane.smith",
      "duration": "1:34:46"
   }
]
```
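
The JSON document is convenient to consume from other programs.
As a rough sketch (the field names are taken from the sample above; `parse_duration` is a hypothetical helper, not something the script provides), you could add up the time spent online by all users like this:

```python
import json
from datetime import timedelta

# Sample document copied from the example above.
SAMPLE = """
[
   {"uid": 89497105, "first_name": "John", "last_name": "Smith",
    "domain": "john.smith", "duration": "0:12:31"},
   {"uid": 3698577, "first_name": "Jane", "last_name": "Smith",
    "domain": "jane.smith", "duration": "1:34:46"}
]
"""


def parse_duration(text):
    """Convert an "H:MM:SS" string into a timedelta (assumed format)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


records = json.loads(SAMPLE)
# Start the sum from an empty timedelta, because sum() defaults to 0.
total = sum((parse_duration(r["duration"]) for r in records), timedelta())
print(total)
```

For the two sample users, the combined duration is 1:47:17.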

The durations are calculated on a per-user basis by default.
You can change that by supplying either `date` (to group by dates), `weekday`
(to group by weekdays) or `hour` (to group by day hours) as the `--grouping`
parameter value.
For example (assuming that both Jane and John spent their time online on Friday,
June 17, 2016):

```
> python -m bin.online_duration --output-format json --grouping date db.csv
[
   {
      "date": "2016-06-17",
      "duration": "1:47:17"
   }
]
```

```
> python -m bin.online_duration --output-format csv --grouping weekday db.csv
Monday,0:00:00
Tuesday,0:00:00
Wednesday,0:00:00
Thursday,0:00:00
Friday,1:47:17
Saturday,0:00:00
Sunday,0:00:00
```
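
The weekday output lists all seven days, including those with no recorded time.
A minimal sketch of how per-date totals could be folded into such a table is shown below.
It is an illustration only, not necessarily how the script does it, and `group_by_weekday` is a hypothetical name.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def group_by_weekday(per_date):
    """Fold {"YYYY-MM-DD": timedelta} totals into per-weekday totals."""
    # Start every weekday at zero so that empty days still show up.
    totals = {name: timedelta() for name in WEEKDAYS}
    for iso_date, duration in per_date.items():
        index = date.fromisoformat(iso_date).weekday()  # Monday is 0
        totals[WEEKDAYS[index]] += duration
    return totals


totals = group_by_weekday(
    {"2016-06-17": timedelta(hours=1, minutes=47, seconds=17)}
)
for name, duration in totals.items():
    print(f"{name},{duration}")
```

June 17, 2016 is a Friday, so the whole duration lands in the `Friday` row and the other rows stay at `0:00:00`.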

```
> python -m bin.online_duration --grouping hour db.csv
0:00:00,0:00:00
1:00:00,0:00:00
2:00:00,0:00:00
3:00:00,0:00:00
4:00:00,0:03:56
5:00:00,0:14:14
6:00:00,0:29:30
7:00:00,0:31:20
8:00:00,0:12:04
9:00:00,0:00:00
10:00:00,0:00:00
11:00:00,0:23:14
12:00:00,0:06:00
13:00:00,0:46:19
14:00:00,0:00:00
15:00:00,0:00:00
16:00:00,0:00:00
17:00:00,0:00:00
18:00:00,0:00:00
19:00:00,0:00:00
20:00:00,0:00:00
21:00:00,0:00:00
22:00:00,0:00:00
23:00:00,0:00:00
```
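
When grouping by hour, a single online session can contribute to more than one row.
The sketch below shows one plausible way to split a session at hour boundaries.
It is an assumption about the approach, not the script's actual code, and both `split_by_hour` and the session times are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_by_hour(start, end):
    """Split the interval [start, end) into per-hour-of-day durations."""
    buckets = defaultdict(timedelta)
    current = start
    while current < end:
        # Beginning of the next full hour after `current`.
        next_hour = (current.replace(minute=0, second=0, microsecond=0)
                     + timedelta(hours=1))
        chunk_end = min(next_hour, end)
        buckets[current.hour] += chunk_end - current
        current = chunk_end
    return dict(buckets)


# A hypothetical session from 04:56:04 to 05:14:14 (UTC).
session = split_by_hour(datetime(2016, 6, 17, 4, 56, 4),
                        datetime(2016, 6, 17, 5, 14, 14))
for hour, duration in sorted(session.items()):
    print(f"{hour}:00:00,{duration}")
```

This hypothetical session contributes 0:03:56 to the `4:00:00` row and 0:14:14 to the `5:00:00` row.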

In my opinion, the script's most useful feature is its ability to easily plot
the data shown in the examples above.
To produce a plot, pass `plot` as the `--output-format` parameter value and add
a file path to write the image to:

```
> python -m bin.online_duration --output-format plot db.csv user.png
```

![user.png]

```
> python -m bin.online_duration --output-format plot --grouping date db.csv date.png
```

![date.png]

```
> python -m bin.online_duration --output-format plot --grouping weekday db.csv weekday.png
```

![weekday.png]

```
> python -m bin.online_duration --output-format plot --grouping hour db.csv hour.png
```

![hour.png]

You can limit the scope of the database by supplying a time range.
Only online durations that fall within the supplied range are then processed.
Set the range by specifying either or both of the `--from` and `--to`
parameters.
Values must be in the `%Y-%m-%dT%H:%M:%SZ` format (a subset of ISO 8601).

All dates and times are in UTC.
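
If you build the `--from` and `--to` values programmatically, a small sketch like the following produces strings in the required format.
The helper names `format_timestamp` and `parse_timestamp` are made up for this example; only the format string comes from this document.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_timestamp(moment):
    """Render a timezone-aware datetime as a UTC timestamp string."""
    return moment.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)


def parse_timestamp(text):
    """Parse a timestamp string back into an aware UTC datetime."""
    parsed = datetime.strptime(text, TIMESTAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


date_from = format_timestamp(datetime(2016, 6, 17, tzinfo=timezone.utc))
date_to = format_timestamp(datetime(2016, 6, 18, tzinfo=timezone.utc))
print(f"python -m bin.online_duration --from {date_from} --to {date_to} db.csv")
```

This prints a command line covering the whole of June 17, 2016 (UTC), with `--from 2016-06-17T00:00:00Z` and `--to 2016-06-18T00:00:00Z`.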

[matplotlib]: http://matplotlib.org/
[track_status.py]: track_status.md

[user.png]: images/user.png
[date.png]: images/date.png
[weekday.png]: images/weekday.png
[hour.png]: images/hour.png

Known issues
------------

* When people go online using the web version and then don't visit any other
pages for a while (for example, just listening to music), they appear offline.
This explains the 0:00:00 durations you might sometimes encounter.
The same might also happen with other clients.

See also
--------

* [License]

[License]: ../README.md#license