VK scripts
==========

A collection of scripts abusing the VK.com API.
Requires Python 3.4 or higher.

Usage
-----

Pass the `--help` flag to a script to see its detailed usage information.

### track_status.py

Track when people go online/offline.

    usage: track_status.py [-h] [-t TIMEOUT] [-l LOG]
                           [--output-format {csv,log,null}] [-o OUTPUT]
                           UID [UID ...]

For example (using made-up user IDs/"screen names"),

    > track_status.py john.doe jane.smith
    [2016-06-18 01:43:34] John Doe is ONLINE.
    [2016-06-18 01:43:34] John Doe was last seen at 2016-06-18 01:33:58+03:00 using the official iPhone app.
    [2016-06-18 01:43:34] Jane Smith is OFFLINE.
    [2016-06-18 01:43:34] Jane Smith was last seen at 2016-06-18 01:15:47+03:00 using the web version (or an unrecognized app).
    [2016-06-18 01:59:09] Jane Smith went ONLINE.
    [2016-06-18 01:59:09] Jane Smith was last seen at 2016-06-18 01:59:07+03:00 using the official Android app.
    [2016-06-18 02:10:00] John Doe went OFFLINE.
    [2016-06-18 02:10:00] John Doe was last seen at 2016-06-18 01:54:58+03:00 using the official iPhone app.
    ...

By default, the script produces a human-readable log.
Use the `--log` parameter to write the log to a file.
If you want to record when people go online/offline for [further analysis],
specify the path to a database using the `--output` parameter.
Be careful: if the file already exists, it will be overwritten!

[further analysis]: #online_durationpy

### online_duration.py

View the amount of time people spent online.

    usage: online_duration.py [-h] [--grouping {user,date,weekday}]
                              [--input-format {csv,log,null}]
                              [--output-format {csv,json,img}]
                              input [output]

This script additionally requires [matplotlib] to be installed.

Analyze the database produced by [track_status.py] and calculate the total
amount of time people spent online.

For example (assuming the database in "db.csv" was previously generated by
[track_status.py]):

    > online_duration.py db.csv
    89497105,John,Smith,john.smith,0:12:31
    3698577,Jane,Smith,jane.smith,1:34:46

In the example above, "John Smith" and "Jane Smith" spent approximately 13 and
95 minutes online, respectively.

The output format is CSV (comma-separated values) by default.
You can also get a JSON document:

    > online_duration.py --output-format json db.csv
    [
       {
          "uid": 89497105,
          "first_name": "John",
          "last_name": "Smith",
          "screen_name": "john.smith",
          "duration": "0:12:31"
       },
       {
          "uid": 3698577,
          "first_name": "Jane",
          "last_name": "Smith",
          "screen_name": "jane.smith",
          "duration": "1:34:46"
       }
    ]
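
If you want to process the JSON output further, a minimal Python sketch might look like the one below.
It assumes the durations always use the `H:MM:SS` format shown in the examples above (longer durations might be formatted differently); the `parse_duration` and `total_minutes` helpers are made up for illustration and aren't part of the scripts.

```python
import json
from datetime import timedelta


def parse_duration(text):
    """Convert an "H:MM:SS" string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def total_minutes(json_text):
    """Map each screen name to its online time, rounded to whole minutes."""
    records = json.loads(json_text)
    return {
        record["screen_name"]: round(
            parse_duration(record["duration"]).total_seconds() / 60)
        for record in records
    }


# The same document as in the example above, in a compact form.
sample = """
[
   {"uid": 89497105, "first_name": "John", "last_name": "Smith",
    "screen_name": "john.smith", "duration": "0:12:31"},
   {"uid": 3698577, "first_name": "Jane", "last_name": "Smith",
    "screen_name": "jane.smith", "duration": "1:34:46"}
]
"""
print(total_minutes(sample))  # {'john.smith': 13, 'jane.smith': 95}
```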

The durations are calculated on a per-user basis by default.
You can change that by supplying either `date` (to group by dates), `weekday`
(to group by weekdays) or `hour` (to group by day hours) as the `--grouping`
parameter value.
For example (assuming that both John and Jane spent their time online on
Friday, June 17, 2016):

```
> online_duration.py --output-format json --grouping date db.csv
[
   {
      "date": "2016-06-17",
      "duration": "1:47:17"
   }
]
```

```
> online_duration.py --output-format csv --grouping weekday db.csv
Monday,0:00:00
Tuesday,0:00:00
Wednesday,0:00:00
Thursday,0:00:00
Friday,1:47:17
Saturday,0:00:00
Sunday,0:00:00
```

```
> online_duration.py --grouping hour db.csv
0:00:00,0:00:00
1:00:00,0:00:00
2:00:00,0:00:00
3:00:00,0:00:00
4:00:00,0:03:56
5:00:00,0:14:14
6:00:00,0:29:30
7:00:00,0:31:20
8:00:00,0:12:04
9:00:00,0:00:00
10:00:00,0:00:00
11:00:00,0:23:14
12:00:00,0:06:00
13:00:00,0:46:19
14:00:00,0:00:00
15:00:00,0:00:00
16:00:00,0:00:00
17:00:00,0:00:00
18:00:00,0:00:00
19:00:00,0:00:00
20:00:00,0:00:00
21:00:00,0:00:00
22:00:00,0:00:00
23:00:00,0:00:00
```
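
One plausible way to attribute a single online session to day hours is to split it at hour boundaries.
The sketch below only illustrates that idea and is not necessarily how the script does it; the `split_by_hour` function and the session times are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_by_hour(start, end):
    """Split the session [start, end) at hour boundaries.

    Returns a dict mapping the hour of day (0-23) to the time spent in it.
    """
    totals = defaultdict(timedelta)
    cursor = start
    while cursor < end:
        # The beginning of the hour following the current position.
        next_hour = (cursor.replace(minute=0, second=0, microsecond=0)
                     + timedelta(hours=1))
        chunk_end = min(next_hour, end)
        totals[cursor.hour] += chunk_end - cursor
        cursor = chunk_end
    return dict(totals)


# A hypothetical session from 04:56:04 to 05:14:14 (UTC).
session = split_by_hour(datetime(2016, 6, 17, 4, 56, 4),
                        datetime(2016, 6, 17, 5, 14, 14))
for hour, duration in sorted(session.items()):
    print(hour, duration)
# 4 0:03:56
# 5 0:14:14
```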

In my opinion, the script's most useful feature is the ability to easily create
plots that represent the text data (like in the examples above).
To produce a plot, pass `img` as the `--output-format` parameter value and add
a file path to write the image to.

    > online_duration.py --output-format img db.csv user.png

![user.png]

    > online_duration.py --output-format img --grouping date db.csv date.png

![date.png]

    > online_duration.py --output-format img --grouping weekday db.csv weekday.png

![weekday.png]

    > online_duration.py --output-format img --grouping hour db.csv hour.png

![hour.png]

You can limit the scope of the database by supplying a time range.
Only online durations that fall within the supplied range are then processed.
Set the range by specifying either or both of the `--from` and `--to`
parameters.
Values must be in the `%Y-%m-%dT%H:%M:%SZ` format (a subset of ISO 8601).

All dates and times are in UTC.
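
If you need to build such a value from a local time, a small Python sketch like the following could help.
The `to_range_argument` and `from_range_argument` helpers are made up for illustration; only the format string comes from the description above.

```python
from datetime import datetime, timedelta, timezone

RANGE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_range_argument(moment):
    """Format an aware datetime as a UTC timestamp in the expected format."""
    return moment.astimezone(timezone.utc).strftime(RANGE_FORMAT)


def from_range_argument(text):
    """Parse a timestamp in the expected format into an aware UTC datetime."""
    return datetime.strptime(text, RANGE_FORMAT).replace(tzinfo=timezone.utc)


# A moment in the UTC+03:00 time zone is converted to UTC first.
local = datetime(2016, 6, 18, 1, 43, 34, tzinfo=timezone(timedelta(hours=3)))
print(to_range_argument(local))  # 2016-06-17T22:43:34Z
```

The resulting string is the kind of value the `--from` and `--to` parameters expect, for example `--from 2016-06-17T22:43:34Z`.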

#### Known issues

* When people use the web version without navigating to other pages for a while
(for example, when they're just listening to music), they appear offline.
This explains the 0:00:00 durations you might sometimes encounter.
The same might happen with other clients.

[matplotlib]: http://matplotlib.org/
[track_status.py]: #track_statuspy

[user.png]: img/online_duration/user.png
[date.png]: img/online_duration/date.png
[weekday.png]: img/online_duration/weekday.png
[hour.png]: img/online_duration/hour.png

### mutual_friends.py

Learn who your ex and her new boyfriend are both friends with.

    usage: mutual_friends.py [-h] [--output-format {csv,json}] UID [UID ...]

For example (using made-up user IDs/"screen names"),

    > mutual_friends.py john.doe jane.doe
    89497105,John,Smith,john.smith
    3698577,Jane,Smith,jane.smith

In the example above, both "John Doe" and "Jane Doe" are friends with "John
Smith" and "Jane Smith", whose user IDs are 89497105 and 3698577 respectively.
Their "screen names" (the part after "vk.com/" of their personal page URLs) are
"john.smith" and "jane.smith".

The output format is CSV (comma-separated values) by default.
You can also get a JSON document:

    > mutual_friends.py --output-format json john.doe jane.doe
    [
       {
          "uid": 89497105,
          "first_name": "John",
          "last_name": "Smith",
          "screen_name": "john.smith"
       },
       {
          "uid": 3698577,
          "first_name": "Jane",
          "last_name": "Smith",
          "screen_name": "jane.smith"
       }
    ]
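
Conceptually, finding mutual friends boils down to intersecting friend lists.
The sketch below shows that idea on made-up data and doesn't call the VK.com API; the `mutual_friends` function and the friend lists are hypothetical and don't necessarily reflect how the script works.

```python
def mutual_friends(friend_lists):
    """Return the user IDs present in every friend list, sorted ascending."""
    sets = [set(friends) for friends in friend_lists]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Made-up friend lists (user IDs) of two people.
john_doe_friends = [89497105, 3698577, 1001]
jane_doe_friends = [3698577, 2002, 89497105]
print(mutual_friends([john_doe_friends, jane_doe_friends]))
# [3698577, 89497105]
```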

License
-------

This project, including all of the files and their contents, is licensed under
the terms of the MIT License.
See [LICENSE.txt] for details.

[LICENSE.txt]: LICENSE.txt