JSON Stream is a tool for analysing streams of JSON objects. The input file needs to have one JSON object per line, e.g.:
$ cat records
{"person":{"name":"John", "age": 23}}
{"person":{"name":"Alice", "height": 162}}
{"person":{"name":"Bob", "age": 23, "height": 180}}
Usage:
js <mode> [file path]
Modes:
- keys - find all keys and count how many times each occurs
- enums - find all unique values for each key
- enum_stats - calculate how many times each value for a given key occurs
$ js keys records
{
"person": {
"age": 2,
"height": 2,
"name": 3
}
}
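The counting behind the keys mode can be sketched in Python as follows. This is an illustration of the idea, not the tool's implementation; `count_keys` is a hypothetical name, and the sketch assumes that arrays are treated as leaf values and that a key consistently holds either an object or a leaf.

```python
import json
from typing import Any


def count_keys(value: Any, counts: dict) -> dict:
    """Add the keys found in one parsed record to a nested count tree."""
    if isinstance(value, dict):
        for key, child in value.items():
            if isinstance(child, dict):
                # Nested object: descend, keeping one subtree per key.
                subtree = counts.setdefault(key, {})
                if isinstance(subtree, dict):
                    count_keys(child, subtree)
            else:
                # Leaf value: count one occurrence of this key.
                current = counts.get(key, 0)
                if isinstance(current, int):
                    counts[key] = current + 1
    return counts


lines = [
    '{"person":{"name":"John", "age": 23}}',
    '{"person":{"name":"Alice", "height": 162}}',
    '{"person":{"name":"Bob", "age": 23, "height": 180}}',
]
counts: dict = {}
for line in lines:
    count_keys(json.loads(line), counts)
print(json.dumps(counts, indent=2, sort_keys=True))
```

A key that appears as an object in one record and as a leaf in another is silently skipped here; how the real tool handles that case is not covered by this sketch.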
$ js enums records
{
"person": {
"age": 23,
"height": [162, 180],
"name": ["Bob", "Alice", "John"]
}
}
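A rough Python sketch of the enums idea, collecting the unique values seen for each key. `collect_values` is a hypothetical name and this is not the tool's code. Unlike the output above, the sketch always keeps a list, even when a key has a single unique value, and it lists values in first-seen order.

```python
import json
from typing import Any


def collect_values(value: Any, seen: dict) -> dict:
    """Record each distinct leaf value under its key, in first-seen order."""
    if isinstance(value, dict):
        for key, child in value.items():
            if isinstance(child, dict):
                # Nested object: descend into a per-key subtree.
                subtree = seen.setdefault(key, {})
                if isinstance(subtree, dict):
                    collect_values(child, subtree)
            else:
                values = seen.setdefault(key, [])
                # A list membership test also works for unhashable values.
                if isinstance(values, list) and child not in values:
                    values.append(child)
    return seen


lines = [
    '{"person":{"name":"John", "age": 23}}',
    '{"person":{"name":"Alice", "height": 162}}',
    '{"person":{"name":"Bob", "age": 23, "height": 180}}',
]
seen: dict = {}
for line in lines:
    collect_values(json.loads(line), seen)
print(json.dumps(seen, indent=2, sort_keys=True))
```

The list-based uniqueness check is linear per value, which is fine for a sketch but would be slow for keys with many distinct values.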
$ js enum_stats records
{
"person": {
"age": {
"23": "33.3%"
},
"height": {
"162": "33.3%"
},
"name": [
{
"John": "33.3%"
},
{
"Alice": "33.3%"
},
{
"Bob": "33.3%"
}
]
}
}
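The counting step behind enum_stats, "how many times each value for a given key occurs", might look like the Python sketch below. It covers only the raw counts and makes no attempt to reproduce the percentage formatting shown above. `count_values` is a hypothetical name, and labelling values by their string form (so the number 23 and the string "23" share a label) is an assumption of this sketch.

```python
import json
from collections import Counter
from typing import Any


def count_values(value: Any, stats: dict) -> dict:
    """Count how often each leaf value occurs under each key."""
    if isinstance(value, dict):
        for key, child in value.items():
            if isinstance(child, dict):
                # Nested object: descend into a plain-dict subtree.
                subtree = stats.setdefault(key, {})
                if type(subtree) is dict:
                    count_values(child, subtree)
            else:
                counter = stats.setdefault(key, Counter())
                if isinstance(counter, Counter):
                    # Strings are used as-is; other values use their JSON text.
                    label = child if isinstance(child, str) else json.dumps(child)
                    counter[label] += 1
    return stats


lines = [
    '{"person":{"name":"John", "age": 23}}',
    '{"person":{"name":"Alice", "height": 162}}',
    '{"person":{"name":"Bob", "age": 23, "height": 180}}',
]
stats: dict = {}
for line in lines:
    count_values(json.loads(line), stats)
print(json.dumps(stats, indent=2, sort_keys=True))
```

Dividing each count by a total would turn these into shares; which total the tool uses is not stated here, so that step is left out.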
To analyse a file with Docker using the keys mode, run:
<file docker run -i relar/jsonstream js keys
Reading from stdin is slower than reading from a file, so for bigger jobs mount a volume and read from the file instead.
To build the binary yourself:
- install Elixir
- download dependencies:
mix deps.get
- build the binary:
mix escript.build
- the js binary is now ready to go; pass it stdin or a filename:
js keys records.json