I'm coming around to sticking with RDS, but maybe switching to PostgreSQL with its native JSON functionality. Still, I wish there was a simple JSON document database for AWS, which was backed by S3 storage, available as a service.
Comments (8)
Chris Moyer
via
Twitter
Athena? Or S3 Select?
Jonathan LaCour
via
Twitter
Close, but not quite the right fit. S3 Select is more suited for searching within a single object. Athena is closer, but its interaction model is a little off, as it sends results of queries off to a bucket.
Jonathan LaCour
via
Twitter
What I really want is to be able to list objects in a bucket that match a set of object tags.
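For comparison, here is a minimal sketch of the client-side workaround available without such a feature: list every key, fetch each object's tags, and filter locally. It assumes `s3` is a boto3 S3 client (or anything exposing the same `list_objects_v2` and `get_object_tagging` calls); the function name and the `wanted` parameter are made up for illustration.

```python
from typing import Any, Dict, Iterator


def list_keys_matching_tags(s3: Any, bucket: str, wanted: Dict[str, str]) -> Iterator[str]:
    """Yield keys in `bucket` whose tags include every key/value pair in `wanted`.

    `s3` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    kwargs = {"Bucket": bucket}
    while True:
        # List one page of objects in the bucket.
        page = s3.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            # One extra request per object to read its tag set.
            tagging = s3.get_object_tagging(Bucket=bucket, Key=obj["Key"])
            tags = {t["Key"]: t["Value"] for t in tagging["TagSet"]}
            if all(tags.get(k) == v for k, v in wanted.items()):
                yield obj["Key"]
        # Follow the continuation token until the listing is exhausted.
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

This costs one tagging request per object on top of the listing, so it only makes sense for small buckets, which is why a server-side "list by tags" call would matter.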
Joe Harris
via
Twitter
Have you tried Redshift Spectrum? It lets you query JSON data (and other formats) directly from S3 - no loading. Mix and match external S3 data with Redshift data. Performance is excellent.
Docs: docs.aws.amazon.com/redshift/lates…
Customer blog: aws.amazon.com/blogs/big-data…
Jonathan LaCour
via
Twitter
I'm familiar, and it's an amazing service, but I think it's overkill for my use case (and many use cases). It requires a Redshift cluster running 24/7, which incurs cost. I really want to have something that is entirely consumption based, otherwise I'd just use RDS + PostgreSQL.
Jonathan LaCour
via
Twitter
If S3 had a feature to list objects based upon the new-ish object tagging features, that would enable a whole new class of application on top of S3. Essentially a basic object database with object classification/taxonomies.
Joe Harris
via
Twitter
You don't have to keep the cluster up 24/7 if you only issue queries at certain times. Clusters come up and down in just a few minutes. External table metadata is stored in the Glue catalog, so it persists without a cluster. We announced at re:Invent that we will support nested data.
Jonathan LaCour
via
Twitter
My use case is to query from a Lambda function that is exposed via API Gateway as an HTTP service to back a CMS. Needs to be always available instantly, which means I'd pretty much have to leave it running.