Quilt provides support for queries in the ElasticSearch DSL, as well as SQL queries in Athena.
The objects in S3 buckets connected to Quilt are synchronized to an ElasticSearch cluster, which provides Quilt's search features. For custom queries, you can use the Queries tab in the Quilt catalog to query the ElasticSearch cluster directly.
Quilt maintains a near-realtime index of the objects in your S3 bucket in ElasticSearch. Each bucket corresponds to one or more ElasticSearch indexes. As objects are mutated in S3, Quilt uses an event-driven system (via SNS and SQS) to update ElasticSearch.
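To make the event-driven flow more concrete, here is a minimal sketch of how a consumer might turn one SQS message into index updates. It assumes the standard AWS shapes (an S3 event notification wrapped in an SNS envelope and delivered to SQS without raw message delivery). The function name `actions_from_sqs_body` and the `"index"`/`"delete"` action labels are hypothetical, and Quilt's actual indexer may work differently.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def actions_from_sqs_body(body: str) -> List[Tuple[str, str, str]]:
    """Map one SQS message body to (action, bucket, key) tuples.

    Hypothetical helper: illustrates the SNS -> SQS -> index flow only.
    """
    envelope = json.loads(body)                 # SNS envelope as delivered to SQS
    s3_event = json.loads(envelope["Message"])  # the S3 notification is a JSON string
    actions = []
    for record in s3_event.get("Records", []):
        event_name = record["eventName"]
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        if event_name.startswith("ObjectCreated:"):
            actions.append(("index", bucket, key))
        elif event_name.startswith("ObjectRemoved:"):
            actions.append(("delete", bucket, key))
    return actions
```

Each returned tuple would then be applied to the ElasticSearch index for that bucket. Other event types are ignored in this sketch.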
There are two types of indexing in Quilt:
shallow indexing includes object metadata (such as the file name and size)
deep indexing includes object contents. Quilt supports deep indexing for the following file extensions:
.ipynb (Jupyter notebooks)
.html, .txt, .tsv, .csv, .md (plus many other plain-text formats)
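The distinction between the two indexing types can be sketched as a simple lookup on the file extension. This is an illustration only: `DEEP_INDEX_EXTENSIONS` and `indexing_mode` are hypothetical names, the set contains only the extensions named above (not the "many other plain-text formats"), and the case-insensitive match is an assumption.

```python
import posixpath

# Only the extensions listed above; the real list is longer.
DEEP_INDEX_EXTENSIONS = {".ipynb", ".html", ".txt", ".tsv", ".csv", ".md"}


def indexing_mode(key: str) -> str:
    """Return "deep" if the object's contents would be indexed, else "shallow"."""
    _, ext = posixpath.splitext(key)
    # Assumption: extensions are compared case-insensitively.
    return "deep" if ext.lower() in DEEP_INDEX_EXTENSIONS else "shallow"
```

Under these assumptions, `indexing_mode("analysis/report.ipynb")` yields `"deep"`, while `indexing_mode("images/plot.png")` yields `"shallow"`. In both cases the object metadata is indexed.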
Quilt ElasticSearch queries support the following keys:
index — comma-separated list of indexes to search (learn more)
filter_path — filters the response to reduce nesting (learn more)
_source — boolean that adds or removes the _source field, or a list of fields to return (learn more)
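As an illustration of how the keys described so far fit together, the following sketch assembles them into a JSON object of the kind you might paste into the Queries tab. The helper `build_query`, the index name `my-bucket`, and the field names `key` and `size` are hypothetical. A complete query would normally carry additional keys, such as a query body.

```python
import json
from typing import Dict, List, Optional, Union


def build_query(
    indexes: List[str],
    filter_path: Optional[str] = None,
    source: Optional[Union[bool, List[str]]] = None,
) -> Dict[str, object]:
    """Assemble the index, filter_path, and _source keys into one dict."""
    query: Dict[str, object] = {"index": ",".join(indexes)}  # comma-separated list
    if filter_path is not None:
        query["filter_path"] = filter_path
    if source is not None:
        query["_source"] = source  # boolean, or a list of fields to return
    return query


# Hypothetical index and field names, for illustration only.
print(json.dumps(build_query(["my-bucket"], "hits.hits._source", ["key", "size"]), indent=2))
```

Passing `source=False` would drop the `_source` field from each hit, while a list limits the hits to the named fields.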