jsongrep is a command-line tool providing a JSONPath-inspired query language
for searching and filtering JSON, YAML, TOML, CBOR, and MessagePack data.
It supports wildcards, recursive descent, array slicing, filter expressions,
and regex matching.
README
jsongrep is a command-line tool and Rust library for
fast querying
of JSON, YAML, TOML, JSONL, CBOR, and MessagePack documents using regular path expressions.
JSON documents are trees: objects and arrays branch into nested values, with
edges labeled by field names or array indices. jsongrep lets you describe
sets of paths through this tree using regular expression operators - the
same way you'd match patterns in text.
**.name # Kleene star: match "name" under nested objects
users[*].email # Wildcard: all emails in the users array
(error|warn).* # Disjunction: any field under "error" or "warn"
(* | [*])*.name # Any depth: match "name" through both objects and arrays
This is different from tools like jq, which use a filter pipeline to transform
data. With jsongrep, you declare what paths to match rather than describing
how to transform. The query compiles to a
DFA that
processes the document efficiently.
See the blog post for the motivation
and design behind jsongrep.
jsongrep vs jq
jq is a powerful tool, but its filter syntax can be verbose for common
path-matching tasks. jsongrep is declarative: you describe the shape of the
paths you want, and the engine finds them.
Find a field at any depth:
jsongrep: -F treats the query as a literal field name at any depth:
curl -s https://api.nobelprize.org/v1/prize.json | jg -F firstname | head -6
jsongrep also shows where each match was found (e.g.,
prizes.[0].laureates.[0].firstname:), which jq does not. (Examples below
show terminal output; when piped, path headers are hidden by default. See
--with-path / --no-path.)
Test data ranges from a small sample JSON to a 190 MB GeoJSON file
(citylots.json), with queries
chosen to exercise equivalent functionality across tools (recursive descent,
wildcards, nested paths). Where a tool lacks a feature, the benchmark is
skipped rather than faked.
jg natively supports multiple serialization formats. Non-JSON formats are
converted to JSON at the boundary, then queried with the same engine, so your
queries work identically regardless of input format.
Explicit format flag (useful for stdin or non-standard extensions):
cat config.yaml | jg -f yaml 'database.host'
database.host:
"localhost"
Binary formats (CBOR, MessagePack):
jg 'name' data.cbor
jg -f msgpack 'name' data.bin
Format
Extensions
Feature flag
Notes
JSON
.json (default)
—
Always available
JSONL/NDJSON
.jsonl, .ndjson
—
Wrapped into JSON array
YAML
.yaml, .yml
yaml
Included by default
TOML
.toml
toml
Included by default
CBOR
.cbor
cbor
Included by default
MessagePack
.msgpack, .mp
msgpack
Included by default
All format dependencies are included by default. To build without them:
cargo install jsongrep --no-default-features
CLI Usage
JSONPath-inspired query language for JSON, YAML, TOML, and other serialization formats
Usage: jg [OPTIONS] [QUERY] [FILE] [COMMAND]
Commands:
generate Generate additional documentation and/or completions
Arguments:
[QUERY] Query string (e.g., "**.name")
[FILE] Optional path to file. If omitted, reads from STDIN
Options:
-i, --ignore-case Case insensitive search
--compact Do not pretty-print the JSON output
--count Display count of number of matches
--depth Display depth of the input document
--porcelain Machine-readable output: strip labels and colors (useful for piping)
-n, --no-display Do not display matched JSON values
-F, --fixed-string Treat the query as a literal field name and search at any depth
--with-path Always print the path header, even when output is piped
--no-path Never print the path header, even in a terminal
-f, --format Input format (auto-detects from file extension if omitted) [default: auto] [possible values: auto, json, jsonl, yaml, toml, cbor, msgpack]
-h, --help Print help (see more with '--help')
-V, --version Print version
More CLI Examples
Search for a literal field name at any depth:
curl -s https://api.nobelprize.org/v1/prize.json | jg -F motivation | head -4
By default, path headers display in terminals and hide when output is piped
(like ripgrep's --heading). This makes piping to sort, uniq, etc., work
cleanly:
Queries are regular expressions over paths. If you know regex, this will
feel familiar:
Operator
Example
Description
Sequence
foo.bar.baz
Concatenation: match path foo → bar → baz
Disjunction
foo | bar
Union: match either foo or bar
Kleene star
**
Match zero or more field accesses
Repetition
foo*
Repeat the preceding step zero or more times
Wildcards
* or [*]
Match any single field or array index
Optional
foo?.bar
Optional foo field access
Field access
foo or "foo bar"
Match a specific field (quote if spaces)
Array index
[0] or [1:3]
Match specific index or slice (exclusive end)
These queries can be arbitrarily nested with parentheses. For example,
foo.(bar|baz).qux matches foo.bar.qux or foo.baz.qux.
This also means that you can recursively descend any path with (* | [*])*,
e.g., (* | [*])*.foo to find all paths matching foo field at any
depth.
The query engine compiles expressions to an
NFA, then
determinizes to a
DFA for
execution. See the grammar directory and the
query module for implementation details.
> Experimental: The grammar supports /regex/ syntax for matching field
> names by pattern, but this is not yet fully implemented. Determinizing
> overlapping regexes (e.g., /a/ vs /aab/) requires subset construction
> across multiple patterns - planned but not complete.
Library Usage
Add to your Cargo.toml:
[dependencies]
jsongrep = "0.9.0"
Query with a one-liner:
let json: jsongrep::Value = serde_json::from_str(r#"{"users": [{"name": "Alice"}]}"#)?;
let results = jsongrep::grep(&json, "users[*].name")?;
for result in &results {
println!("{:?}: {}", result.path, result.value);
}
For repeated queries, compile the DFA once and reuse it:
use jsongrep::query::QueryDFA;
let dfa = QueryDFA::from_query_str("users[*].name")?;
let results = dfa.find(&json);
Build queries programmatically with QueryBuilder:
use jsongrep::query::{QueryBuilder, QueryDFA};
let query = QueryBuilder::new()
.field("users")
.array_wildcard()
.field("name")
.build();
let dfa = QueryDFA::from_query(&query);
let results = dfa.find(&json);