Debugging
Each Dgraph data node exposes profiling information over the /debug/pprof
endpoint and metrics over the /debug/vars endpoint. Each node serves its own
profiling and metrics information. Below is a list of the debugging information
exposed by Dgraph and the corresponding commands to retrieve it.
Metrics Information
If you are collecting these metrics from outside the Dgraph instance, you need
to pass the --expose_trace=true flag; otherwise, the metrics can only be
collected by connecting to the instance over localhost.
Metrics can also be retrieved in the Prometheus format at
/debug/prometheus_metrics. See the Metrics section for the full list of
metrics.
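For example, assuming a Dgraph Alpha serving its HTTP endpoint on the default port 8080 (adjust the host and port to your deployment), the metrics can be fetched with curl:

```sh
# Expvar-style metrics
curl http://localhost:8080/debug/vars

# Prometheus-format metrics
curl http://localhost:8080/debug/prometheus_metrics
```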
Profiling Information
Profiling information is available via the go tool pprof profiling tool built
into Go. The “Profiling Go programs” Go blog post should help you get started
with pprof. Each Dgraph Zero and Dgraph Alpha node exposes a debug endpoint at
/debug/pprof/&lt;profile&gt; via its HTTP port.
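As a minimal sketch, assuming a Dgraph Alpha with its HTTP port on the default 8080, a profile can be fetched and opened in the interactive pprof shell like this (substitute your node's HTTP port and the profile you want):

```sh
# <profile> is one of: profile (CPU), heap, block, goroutine, mutex, threadcreate
go tool pprof http://localhost:8080/debug/pprof/<profile>
```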
The output of the command shows the location where the profile is stored. In
the interactive pprof shell, you can use commands like top to get a listing of
the top functions in the profile, web to open a visual graph of the profile in
a web browser, or list to display a code listing with profiling information
overlaid.
CPU profile
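For example, assuming the default Alpha HTTP port 8080 (the CPU profile is sampled over a period of time, 30 seconds by default):

```sh
go tool pprof http://localhost:8080/debug/pprof/profile
```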
Memory profile
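For example, again assuming the default Alpha HTTP port 8080:

```sh
go tool pprof http://localhost:8080/debug/pprof/heap
```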
Block profile
Dgraph doesn’t collect the block profile by default. Dgraph must be started
with --profile_mode=block and --block_rate=&lt;N&gt; with N > 1.
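Once block profiling is enabled, the profile can be fetched like the others (port assumed as before):

```sh
go tool pprof http://localhost:8080/debug/pprof/block
```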
Goroutine stack
The HTTP page /debug/pprof/
is available at the HTTP port of a Dgraph Zero or
Dgraph Alpha. From this page a link to the “full goroutine stack dump” is
available (for example, on a Dgraph Alpha this page would be at
http://localhost:8080/debug/pprof/goroutine?debug=2
). Looking at the full
goroutine stack can be useful to understand goroutine usage at that moment.
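If you want to keep a copy of the dump for later analysis, you can save it from the command line (URL as in the example above):

```sh
curl "http://localhost:8080/debug/pprof/goroutine?debug=2" > goroutine_dump.txt
```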
Profiling Information with debuginfo
Instead of sending a request to the server for each CPU, memory, and goroutine
profile, you can use the debuginfo
command to collect all of these profiles,
along with several metrics.
You can run the command like this:
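A sketch of the invocation, assuming an Alpha gRPC endpoint on localhost:9080, a Zero on localhost:6080, and an output directory of ./dgraph-debuginfo (the addresses, the directory, and the -a/-z/-d flag spellings are assumptions; verify them with dgraph debuginfo --help):

```sh
dgraph debuginfo -a localhost:9080 -z localhost:6080 -d ./dgraph-debuginfo
```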
Your output should look like:
When the command finishes, debuginfo
returns the tarball’s file name. If no
destination has been specified, the file is created in the same directory from
where you ran the debuginfo
command.
The following files contain the metrics collected by the debuginfo
command:
Command parameters
The metrics flag (-m)

By default, debuginfo collects:

- heap
- cpu
- state
- health
- jemalloc
- trace
- metrics
- vars
- goroutine
- block
- mutex
- threadcreate
If needed, you can collect only a subset of them. For example, this command
collects only the jemalloc and health profiles:
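A sketch, with the same assumed addresses as above (treating -m as taking a comma-separated list of metric names is also an assumption; check dgraph debuginfo --help):

```sh
dgraph debuginfo -a localhost:9080 -z localhost:6080 -m jemalloc,health
```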
Profiles details

- cpu profile: The CPU profile determines where a program spends its time while actively consuming CPU cycles (as opposed to sleeping or waiting for I/O).
- heap: The heap profile reports memory allocation samples; it's used to monitor current and historical memory usage and to check for memory leaks.
- threadcreate: The thread creation profile reports the sections of the program that lead to the creation of new OS threads.
- goroutine: The goroutine profile reports the stack traces of all current goroutines.
- block: The block profile shows where goroutines block waiting on synchronization primitives (including timer channels).
- mutex: The mutex profile reports lock contentions. When you think your CPU isn’t fully utilized due to mutex contention, use this profile.
- trace: The execution trace captures a wide range of runtime events. The execution tracer is a tool to detect latency and utilization problems. You can examine how well the CPU is utilized and when networking or syscalls cause goroutines to be preempted. The tracer is useful to identify poorly parallelized execution, to understand some of the core runtime events, and to see how your goroutines execute.
Using the debug tool
To debug a running Dgraph cluster, first copy the postings (“p”) directory to
another location. If the Dgraph cluster isn’t running, then you can use the same
postings directory with the debug tool. If the “p” directory has been encrypted,
then the debug tool needs to use the --keyfile <path-to-keyfile>
option. This
file must contain the same key that was used to encrypt the “p” directory.
The dgraph debug
tool can be used to inspect Dgraph’s posting list structure.
You can use the debug tool to inspect the data, schema, and indices of your
Dgraph cluster.
Some scenarios where the debug tool is useful:
- Verify that mutations committed to Dgraph have been persisted to disk.
- Verify that indices are created.
- Inspect the history of a posting list.
- Parse a Badger key into a meaningful struct.
Example
Debug the p directory.
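A sketch, assuming the copied postings directory is at ./p (the --postings flag name is an assumption; check dgraph debug --help):

```sh
dgraph debug --postings ./p
```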
Debug the p directory, not opening in read-only mode. This is typically necessary when the database wasn’t closed properly.
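For example (the --readonly flag name is an assumption):

```sh
dgraph debug --postings ./p --readonly=false
```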
Debug the p directory, only outputting the keys for the predicate 0-name
. Note
that 0 is the namespace and name is the predicate.
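For example (the --pred flag name is an assumption):

```sh
dgraph debug --postings ./p --pred 0-name
```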
Debug the p directory, looking up a particular key:
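For example, passing the key as hex (the key here is a placeholder):

```sh
dgraph debug --postings ./p --lookup <hex-key>
```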
Debug the p directory, inspecting the history of a particular key:
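For example (placeholder key again):

```sh
dgraph debug --postings ./p --lookup <hex-key> --history
```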
Debug an encrypted p directory with the key in a local file at the path ./key_file:
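For example:

```sh
dgraph debug --postings ./p --keyfile ./key_file
```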
The key file contains the key used to decrypt/encrypt the db. This key should be kept secret. As a best practice:

- Don’t store the key file on disk permanently. Back it up in a safe place and delete it after using it with the debug tool.
- If this isn’t possible, make sure correct privileges are set on the key file. Only the user who owns the dgraph process should be able to read or write the key file: chmod 600.
Debug tool output
Let’s go over an example with a Dgraph cluster that has the following schema, with a term index, a full-text index, and two separately committed mutations:
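The original schema isn’t reproduced here; an illustrative schema consistent with the predicates that appear in the output below (name with a term index, description with a full-text index, and url as a plain string) might look like this:

```
name: string @index(term) .
description: string @index(fulltext) .
url: string .
```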
After stopping Dgraph, you can run the debug tool to inspect the postings directory:
The debug output can be very large. Typically you would redirect the debug tool’s output to a file first for easier analysis.
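For example (paths assumed):

```sh
dgraph debug --postings ./p > debug_out.txt 2>&1
```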
Each line in the debug output contains a prefix indicating the type of the key:
- {d}: data key
- {i}: index key
- {c}: count key
- {r}: reverse key
- {s}: schema key
In the preceding debug output, we see data keys, index keys, and schema keys.
Each index key has a corresponding index type. For example, in
attr: name term: [1] [dgraph]
the [1]
shows that this is the term index
(0x1). In attr: description term: [8] [fast]
, the [8]
shows that
this is the full-text index (0x8). These IDs match the index IDs
in tok.go.
Key lookup
Every key can be inspected further with the --lookup
flag for the specific
key.
For data keys, a lookup shows its type and value. Below, we see that the key for
attr: url uid: 1
is a string value.
For index keys, a lookup shows the UIDs that are part of this index. Below, we
see that the fast
index for the <description>
predicate has UIDs 0x1 and
0x2.
Key history
You can also look up the history of values for a key using the --history
option.
Above, we see that UID 0x1 was committed to this index at ts 5, and UID 0x2 was committed to this index at ts 7.
The debug output also shows UserMeta information:
- {complete}: Complete posting list
- {uid}: UID posting list
- {delta}: Delta posting list
- {empty}: Empty posting list
- {item}: Item posting list
- {deleted}: Delete marker
Parse key
You can parse a key into its constituent components using --parse_key
. This
doesn’t require a p directory.
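For example (placeholder hex key; the exact invocation is an assumption):

```sh
dgraph debug --parse_key <hex-key>
```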