cmd-ref: document data:status command (#3812)
* cmd-ref: document data:status command
* Update content/docs/sidebar.json
* Apply suggestions from code review
  Co-authored-by: Dave Berenbaum <[email protected]>
  Co-authored-by: David de la Iglesia Castro <[email protected]>
* apply suggestions
* apply suggestions
* fix options
* rewrite description
* Apply suggestions from code review
  Co-authored-by: Dave Berenbaum <[email protected]>
* Restyled by prettier (#3901)
  Co-authored-by: Restyled.io <[email protected]>
* reword granular
* Restyled by prettier (#3912)
  Co-authored-by: Restyled.io <[email protected]>
* fix review suggestions
* change data index.md
* data: remove index
* sync to latest dvc-data-status outputs

Co-authored-by: Jorge Orpinel <[email protected]>
Co-authored-by: Dave Berenbaum <[email protected]>
Co-authored-by: David de la Iglesia Castro <[email protected]>
Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com>
Co-authored-by: Restyled.io <[email protected]>
1 parent 5ed2617, commit 62a5c97

Showing 2 changed files with 146 additions and 0 deletions.
@@ -0,0 +1,135 @@
# data status

Show changes in the data tracked by DVC in the workspace.

## Synopsis

```usage
usage: dvc data status [-h] [-q | -v]
                       [--granular] [--unchanged]
                       [--untracked-files [{no,all}]]
                       [--json]
```

## Description

The `dvc data status` command displays the state of the working directory and
the changes with respect to the last Git commit (`HEAD`). It shows which changes
have been committed to DVC, which have not been committed yet, which files are
tracked by neither DVC nor Git, and which files are missing from the
<abbr>cache</abbr>.

The `dvc data status` command only outputs information; it does not modify
anything in your working directory. It's good practice to check the state of
your repository before running `dvc commit` or `git commit`, so that you don't
accidentally commit something you didn't mean to.
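
As a minimal sketch of the check-before-commit practice described above, the following Python helper inspects text printed by `dvc data status` and reports which sections it considers blocking. The names `blocking_sections` and `check_workspace`, and the policy of treating these two sections as blocking, are assumptions made for illustration; only the section titles come from the example output in this document.

```python
import subprocess

# Section titles as they appear in the example output of `dvc data status`.
# Treating these two as "blocking" is an illustrative policy, not DVC behavior.
BLOCKING_TITLES = ("DVC uncommitted changes:", "Not in cache:")


def blocking_sections(status_text: str) -> list[str]:
    """Return the blocking section titles found in the given status text."""
    lines = {line.strip() for line in status_text.splitlines()}
    return [title for title in BLOCKING_TITLES if title in lines]


def check_workspace() -> list[str]:
    """Run `dvc data status` (DVC must be on PATH) and inspect its output."""
    result = subprocess.run(
        ["dvc", "data", "status"], capture_output=True, text=True, check=True
    )
    return blocking_sections(result.stdout)
```

A script could call `check_workspace()` before `git commit` and stop when the returned list is not empty.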

Example output might look like the following:

```dvc
$ dvc data status
Not in cache:
  (use "dvc fetch <file>..." to download files)
        data/data.xml
DVC committed changes:
  (git commit the corresponding dvc files to update the repo)
        modified: data/features/
DVC uncommitted changes:
  (use "dvc commit <file>..." to track changes)
  (use "dvc checkout <file>..." to discard changes)
        deleted: model.pkl
(there are other changes not tracked by dvc, use "git status" to see)
```

As shown above, `dvc data status` displays changes in multiple categories:

- _Not in cache_ indicates that the hashes for files are recorded in `dvc.lock`
  and `.dvc` files but the corresponding cache files are missing.
- _DVC committed changes_ indicates changes that have been `dvc commit`-ed but
  differ from the last Git commit. Each entry may carry a more detailed state
  describing how the file changed: _added_, _modified_, _deleted_, or
  _unknown_.
- _DVC uncommitted changes_ indicates changes in the working directory that
  have not been `dvc commit`-ed yet. As with _DVC committed changes_, each
  entry may carry a more detailed state describing how the file changed.
- _Untracked files_ shows the files that are not being tracked by DVC and Git.
  These are hidden by default and shown only when
  [`--untracked-files`](#--untracked-files) is specified.
- _DVC Unchanged files_ shows the files that have not changed. These are hidden
  by default and shown only when [`--unchanged`](#--unchanged) is specified.

By default, `dvc data status` does not show individual changes inside tracked
directories; this can be enabled with the [`--granular`](#--granular) option.
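
The category layout described above can be sketched as a small Python parser that groups the entries of the human-readable output under their category titles. This is an illustration based only on the example output in this document: the function name `parse_status` is hypothetical, and the rules it uses (a line ending in `:` starts a category, a parenthesized line is a hint) are assumptions about the text layout rather than a documented format.

```python
def parse_status(text: str) -> dict[str, list[str]]:
    """Group entries of `dvc data status` text output by category title."""
    categories: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("$"):
            continue  # skip blank lines and an echoed shell prompt
        if line.startswith("(") and line.endswith(")"):
            continue  # skip hint lines such as (use "dvc fetch <file>..." ...)
        if line.endswith(":"):
            current = line[:-1]  # a category title, e.g. "Not in cache"
            categories[current] = []
        elif current is not None:
            categories[current].append(line)  # e.g. "modified: data/features/"
    return categories
```

For scripting, the `--json` option is likely a sturdier input than parsed text, since the human-readable layout may change.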

## Options

- `--granular` - show granular, file-level information about the changes in
  DVC-tracked directories. By default, `dvc data status` does not show
  individual changes for files inside tracked directories.

- `--untracked-files` - show files that are not being tracked by DVC and Git.

- `--unchanged` - show unchanged DVC-tracked files.

- `--json` - print the command's output in easily parsable JSON format instead
  of the human-readable output.

- `-h`, `--help` - print the usage/help message and exit.

- `-q`, `--quiet` - do not write anything to standard output.

- `-v`, `--verbose` - display detailed tracing information.
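
As a rough sketch of consuming the `--json` output from a script, the Python function below counts path entries per top-level key without relying on specific key names. The JSON schema is not described in this document, so the sketch only assumes that the top level is an object whose values are lists of paths or mappings from a change type to a list of paths. The function name `count_entries` and the key names in the sample payload are hypothetical.

```python
import json


def count_entries(payload: str) -> dict[str, int]:
    """Count path entries per top-level key of a JSON status document."""
    data = json.loads(payload)
    counts: dict[str, int] = {}
    for key, value in data.items():
        if isinstance(value, list):
            counts[key] = len(value)
        elif isinstance(value, dict):
            # Nested mapping, e.g. change type -> list of paths.
            counts[key] = sum(
                len(paths) for paths in value.values() if isinstance(paths, list)
            )
    return counts


# Hypothetical payload; the key names are illustrative, not a documented schema.
sample = '{"not_in_cache": ["data/data.xml"], "committed": {"modified": ["data/features/"]}}'
print(count_entries(sample))
```

Values that are neither lists nor mappings are skipped, which keeps the function tolerant of fields it does not know about.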

## Examples

```dvc
$ dvc data status
Not in cache:
  (use "dvc fetch <file>..." to download files)
        data/data.xml
DVC committed changes:
  (git commit the corresponding dvc files to update the repo)
        modified: data/features/
DVC uncommitted changes:
  (use "dvc commit <file>..." to track changes)
  (use "dvc checkout <file>..." to discard changes)
        deleted: model.pkl
(there are other changes not tracked by dvc, use "git status" to see)
```

This shows that `data/data.xml` is missing from the cache; that
`data/features/`, a directory, has changes that are tracked by DVC but not yet
committed to Git; and that the file `model.pkl` has been deleted from the
workspace. The `data/features/` directory is reported as modified, but there are
no further details about what changed inside it. The `--granular` option can
provide more information on that.

## Example: Granular output

Continuing the example above, `--granular` shows file-level information for the
changes:

```dvc
$ dvc data status --granular
Not in cache:
  (use "dvc fetch <file>..." to download files)
        data/data.xml
DVC committed changes:
  (git commit the corresponding dvc files to update the repo)
        added: data/features/foo
DVC uncommitted changes:
  (use "dvc commit <file>..." to track changes)
  (use "dvc checkout <file>..." to discard changes)
        deleted: model.pkl
(there are other changes not tracked by dvc, use "git status" to see)
```

Now _DVC committed changes_ has more information about the changes in
`data/features`. The output shows that a new file was added to `data/features`:
`data/features/foo`.
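
To illustrate how granular entries relate to the directory-level entry from the earlier example, the Python sketch below groups `state: path` entries under the tracked directories that contain them. The function name `group_by_directory` is hypothetical, and the `state: path` entry shape is taken from the example output above rather than from a documented format. It needs Python 3.9 or later for `is_relative_to`.

```python
from pathlib import PurePosixPath


def group_by_directory(
    entries: list[str], tracked_dirs: list[str]
) -> dict[str, list[str]]:
    """Map each tracked directory to the granular entries located inside it."""
    grouped: dict[str, list[str]] = {d: [] for d in tracked_dirs}
    for entry in entries:
        # Entries look like "added: data/features/foo"; a bare path has no state.
        _, sep, path = entry.partition(": ")
        if not sep:
            path = entry
        for d in tracked_dirs:
            if PurePosixPath(path).is_relative_to(PurePosixPath(d)):
                grouped[d].append(entry)
    return grouped
```

Given the entries from the granular example and `data/features/` as the tracked directory, only `added: data/features/foo` is grouped under it, matching the single directory-level `modified: data/features/` line in the default output.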