TileDB-Segy is a small MIT-licensed Python library for easy interaction with seismic data, powered by TileDB. It combines an intuitive, segyio-like API with a powerful storage engine.
- Converting from SEG-Y and Seismic Unix formatted seismic data to TileDB arrays.
- Simple and powerful read-only API, closely modeled after segyio.
- 100% unit test coverage.
- Fully type-annotated.
Wish list:
- API for write operations.
- Converting back to SEG-Y.
- TileDB configuration and performance tuning.
- Comprehensive documentation.
- Real-world usage.
TileDB-Segy can be installed:
- from PyPI by pip: `pip install tiledb-segy`
- from source by cloning the Git repository:
  git clone https://github.com/TileDB-Inc/TileDB-Segy.git
  cd TileDB-Segy
  pip install .
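After either installation route, one quick way to confirm that the package is visible to the current interpreter is to query its distribution metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name is assumed to match the one used in the `pip install` command above.

```python
from importlib import metadata


def installed_version(dist_name="tiledb-segy"):
    """Return the installed version string of a distribution, or None.

    The default name is an assumption taken from `pip install tiledb-segy`.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version is not None else "tiledb-segy is not installed")
```

The check only reads package metadata, so it does not import TileDB or open any data.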
You may run the test suite with:
python setup.py test
TileDB-Segy comes with a command-line interface (CLI) called `segy2tiledb` for converting SEG-Y and Seismic Unix formatted files to TileDB formatted arrays. At minimum it takes an input file and generates a directory with the extension `.tsgy` in the same parent directory as the input:
$ segy2tiledb a123.segy
$ du -sh a123.*
73M a123.segy
55M a123.tsgy
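The default output location described above can be sketched in Python. This assumes the rule is a plain extension swap next to the input file, which is what the example shows; the helper name `default_output_path` is hypothetical and not part of TileDB-Segy.

```python
from pathlib import PurePosixPath


def default_output_path(input_path):
    """Guess the directory created when no output path is given.

    Assumption: the input's extension is replaced by .tsgy and the
    parent directory is kept unchanged.
    """
    return PurePosixPath(input_path).with_suffix(".tsgy")


print(default_output_path("a123.segy"))               # a123.tsgy
print(default_output_path("/data/surveys/a123.sgy"))  # /data/surveys/a123.tsgy
```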
To see the full list of options run:
$ segy2tiledb -h
usage: segy2tiledb [-h] [-o] [-g {auto,structured,unstructured}] [--su]
                   [--iline ILINE] [--xline XLINE]
                   [--endian {big,msb,little,lsb}] [-s TILE_SIZE]
                   input [output]

Convert a SEG-Y file to tiledb-segy format

positional arguments:
  input                 Input SEG-Y file path
  output                Output directory path (default: None)

optional arguments:
  -h, --help            show this help message and exit
  -o, --overwrite       Overwrite the output directory if it already exists (default: False)
  -g {auto,structured,unstructured}, --geometry {auto,structured,unstructured}
                        Output geometry:
                        - auto: same as the input SEG-Y.
                        - structured: same as `auto` but abort if a geometry cannot be inferred.
                        - unstructured: opt out on building geometry information.
                        (default: auto)

segyio options:
  --su                  Open a seismic unix file instead of SEG-Y (default: False)
  --iline ILINE         Inline number field in the trace headers (default: 189)
  --xline XLINE         Crossline number field in the trace headers (default: 193)
  --endian {big,msb,little,lsb}
                        File endianness, big/msb (default) or little/lsb (default: big)

tiledb options:
  -s TILE_SIZE, --tile-size TILE_SIZE
                        Tile size in bytes.
                        Larger tile size improves disk access time at the cost of higher memory (default: 4000000)
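When conversions are scripted, it can help to assemble the `segy2tiledb` command line from Python. The sketch below uses only the flags and defaults shown in the help text above; the function name `build_segy2tiledb_args` and the `DEFAULTS` table are hypothetical helpers, not part of TileDB-Segy.

```python
import shlex

# Defaults copied from the help output; options left at these values are omitted.
DEFAULTS = {
    "geometry": "auto",
    "iline": 189,
    "xline": 193,
    "endian": "big",
    "tile_size": 4_000_000,
}


def build_segy2tiledb_args(input_path, output_path=None, *,
                           overwrite=False, su=False, **options):
    """Build an argument list for the segy2tiledb CLI (hypothetical helper)."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    args = ["segy2tiledb"]
    if overwrite:
        args.append("--overwrite")
    if su:
        args.append("--su")
    for name, default in DEFAULTS.items():
        value = options.get(name, default)
        if value != default:
            # tile_size -> --tile-size, geometry -> --geometry, and so on
            args += ["--" + name.replace("_", "-"), str(value)]
    args.append(str(input_path))
    if output_path is not None:
        args.append(str(output_path))
    return args


args = build_segy2tiledb_args("a123.segy", overwrite=True,
                              geometry="structured", tile_size=8_000_000)
print(shlex.join(args))
# segy2tiledb --overwrite --geometry structured --tile-size 8000000 a123.segy
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where `segy2tiledb` is installed.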
TileDB-Segy generally follows the segyio API; you may consult its documentation to learn about the public attributes (`ilines`, `xlines`, `offsets`, `samples`) and addressing modes (`trace`, `header`, `attributes`, `iline`, `xline`, `fast`, `slow`, `depth_slice`, `gather`, `text`, `bin`).
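As an illustration of the public attributes named above, the sketch below summarizes the geometry of a structured file. The function `describe_geometry` is a hypothetical helper, and a stand-in object replaces a real file so that the snippet runs without TileDB-Segy or any data; with the library installed, `f` would be assumed to come from `tiledb.segy.open(dir_path)`.

```python
from types import SimpleNamespace


def describe_geometry(f):
    """Count the entries of each public geometry attribute of a structured file."""
    return {
        "ilines": len(f.ilines),
        "xlines": len(f.xlines),
        "offsets": len(f.offsets),
        "samples": len(f.samples),
    }


# Stand-in for a structured file (assumption: the real object exposes the
# same attribute names as sized sequences or arrays).
stand_in = SimpleNamespace(
    ilines=range(50), xlines=range(20), offsets=range(3), samples=range(100)
)
print(describe_geometry(stand_in))
# {'ilines': 50, 'xlines': 20, 'offsets': 3, 'samples': 100}
```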
You can find usage examples in the accompanying Jupyter notebooks.
TileDB-Segy differs from segyio in the following ways:
- Addressing modes that return a generator of numpy arrays in `segyio` return a single numpy array of higher dimension in `tiledb-segy`. For example, in a SEG-Y with 50 ilines, 20 xlines, 100 samples, and 3 offsets:
  - `f.iline[0:5]`:
    - `segyio` returns a generator that yields 5 2D numpy arrays of (20, 100) shape
    - `tiledb-segy` returns a 3D numpy array of (5, 20, 100) shape
  - `f.iline[0:5, :]`:
    - `segyio` returns a generator that yields 15 2D numpy arrays of (20, 100) shape
    - `tiledb-segy` returns a 4D numpy array of (5, 3, 20, 100) shape
- The mappings returned by `bin`, `header` and `attributes(name)` have string keys instead of `segyio.TraceField` enums or integers.
- `tiledb.segy.open(dir_path)`, the `segyio.open(file_path)` equivalent, does not take any optional parameters (e.g. `strict` or `ignore_geometry`).
- Unstructured and structured SEG-Y are represented as instances of two different classes, `tiledb.segy.Segy` and `tiledb.segy.StructuredSegy` respectively. `StructuredSegy` extends `Segy`, so the structured class inherits the whole unstructured API.
  - All attributes and addressing modes specific to structured files (e.g. `ilines` or `gather`) are available only on `StructuredSegy`. In contrast, `segyio` returns `None` or raises an exception if these properties are accessed on unstructured files.
  - `segyio.tools.dt` is exposed as the `Segy.dt(fallback=4000.0)` method.
  - `segyio.tools.cube` is exposed as the `StructuredSegy.cube()` method.
  - There is no `unstructured` attribute; use `not isinstance(f, StructuredSegy)` instead.
- There is no `tracecount` attribute; use `len(trace)` instead.
- There is no `ext_headers` attribute; use `len(text[1:])` instead.
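The shape arithmetic and the attribute replacements listed above can be sketched without either library installed. All function names below are hypothetical helpers, and the stand-in object only imitates the `trace` and `text` addressing modes with plain Python sequences; a real file would be assumed to come from `tiledb.segy.open(dir_path)`.

```python
from types import SimpleNamespace


def segyio_iline_shapes(n_ilines, n_xlines, n_samples, n_offsets=None):
    """Shapes of the 2D arrays a segyio-style generator would yield."""
    count = n_ilines * (1 if n_offsets is None else n_offsets)
    return [(n_xlines, n_samples)] * count


def tiledb_segy_iline_shape(n_ilines, n_xlines, n_samples, n_offsets=None):
    """Shape of the single array described for tiledb-segy."""
    if n_offsets is None:
        return (n_ilines, n_xlines, n_samples)
    return (n_ilines, n_offsets, n_xlines, n_samples)


def tracecount(f):
    """Replacement for segyio's tracecount attribute."""
    return len(f.trace)


def ext_headers(f):
    """Replacement for segyio's ext_headers attribute."""
    return len(f.text[1:])


# f.iline[0:5] on 20 xlines and 100 samples
print(len(segyio_iline_shapes(5, 20, 100)))               # 5
print(tiledb_segy_iline_shape(5, 20, 100))                # (5, 20, 100)
# f.iline[0:5, :] with 3 offsets
print(len(segyio_iline_shapes(5, 20, 100, n_offsets=3)))  # 15
print(tiledb_segy_iline_shape(5, 20, 100, n_offsets=3))   # (5, 3, 20, 100)

# Stand-in file: 50 * 20 * 3 traces, one main text header plus two extended ones.
stand_in = SimpleNamespace(trace=range(50 * 20 * 3), text=["main", "ext1", "ext2"])
print(tracecount(stand_in))   # 3000
print(ext_headers(stand_in))  # 2
```

Code ported from segyio that iterates over the generator keeps working on the stacked array, since iterating over the first axis of a numpy array yields the same per-line 2D slices in the single-offset case.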