Spanner, Firestore, and Bigtable: Connecting the Dots (Cloud Next '19)



My name is Fred Wolf, this is Arpita, and we're from Streak, a startup of about 40 people based here in San Francisco. In this talk we're really covering two things: one is this host of managed databases on Google Cloud, and the second is how we're building our business on top of them. It's probably helpful to start with what our product is. Streak is an organizational tool on top of Gmail. The way it works behind the scenes is that we ship a Chrome extension that modifies the Gmail UI to add organization on top of it. We have what's called a pipeline, which, if you're in sales, might be a list of your sales deals and the stage they're in, whether they're leads, whether they're closed, that kind of thing; if you're in support, it might be support tickets. People also use this for hiring and business development. The key thing is that it's organization on top of email. Why is that important? There are a bunch of tools for making an organization work better, things like Hangouts Chat or Slack, and things like project management tools, but once you get outside of an organization, basically everybody uses email, just because everybody has it.

The key things Streak adds: we have what's called a pipeline, and you add boxes, which might be sales deals or support tickets. In addition to organizing your emails, so that when you look at your inbox you can see which deals or support tickets each email is associated with, we can also add structured metadata: what stage the deal is in, who's involved, who the point of contact is, how long it's been since the last follow-up, all of that kind of stuff. We're here to talk about technology rather than give a product demo, but I wanted a motivating example of what we're storing in these databases.

So here we're going to talk about our database journey, and just to situate everything with a little preview: we started off on Cloud Datastore; we'll talk about Cloud Firestore, which is Datastore's spiritual successor; and then we'll focus on one particular problem we call the email metadata problem and how building some new product features caused us to reach for Cloud Spanner and Cloud Bigtable, some of the newer hosted database technologies. Now I want to hand it over to Arpita to chat about our use of Cloud Datastore and Cloud Firestore.

Alright, thanks Fred. Let's start with Cloud Datastore. For each database we'll give you a little introduction, explain why we chose that service, share our takeaways from using it, and describe how our data needs evolved to want a newer solution.

Datastore is a NoSQL document store. Your data is organized into documents called entities, which are basically objects accessed via a key or an index value. Similar entities are grouped into kinds, which are like tables in a relational model. Entities have fields called properties, and you get built-in indexes on each of an entity's properties. There are no joins or aggregate queries in Datastore. There is GQL, which is a query language for Datastore, but complex joins and aggregate operations are not supported; simple filtering is possible.
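To give a sense of what that query surface looks like, here's a minimal sketch of a simple GQL filter issued through the Java client; the kind, property, and value names are made up for this example rather than taken from Streak's schema.

```java
import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;
import com.google.cloud.datastore.Entity;
import com.google.cloud.datastore.Query;
import com.google.cloud.datastore.QueryResults;

public class SimpleFilterExample {
    public static void main(String[] args) {
        Datastore datastore = DatastoreOptions.getDefaultInstance().getService();

        // A simple property filter expressed in GQL. "Box" and "stage" are
        // illustrative names, not Streak's actual schema. Joins and aggregates
        // aren't available, so anything more complex has to move into
        // application code.
        Query<Entity> query = Query.newGqlQueryBuilder(
                Query.ResultType.ENTITY,
                "SELECT * FROM Box WHERE stage = @stage")
            .setBinding("stage", "Lead")
            .build();

        QueryResults<Entity> results = datastore.run(query);
        while (results.hasNext()) {
            System.out.println(results.next().getKey());
        }
    }
}
```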

Why did we choose it? Back when Streak was started, it was started on App Engine, and the only database offering on Google Cloud at the time was Datastore, so it made sense to co-locate our database with our server stack on Google Cloud. But even so, we think it's an excellent choice if you're starting out with a new product or service, because it gives you a lot of flexibility to evolve your schema. What that means with Datastore is that entities, as I said, have properties, and the properties can be completely different across entities of the same kind: the number of properties, and even the data types for the same property, can differ. That gives you a lot of freedom to just build your features and evolve your schema as you go. The other reason we love Datastore, of course, is that it's a managed service: no ops, it auto-scales and auto-replicates, Google's SREs handle availability, and you don't have to worry about provisioning capacity up front. You really can just throw data at it and it grows with your service. Back then it was the one candidate, and over ten years later we have about 15 terabytes of data on Datastore, with all our application data on it, and we've found it to perform predictably and to have predictable kinds of challenges and solutions.

Working with Datastore, there are some concepts we found key to using it effectively. One is entity groups. As I was saying, your data is organized into entities, and if multiple entities belong to one entity group you can have a transaction across those entities. So you basically want your most highly correlated data in one entity group; that could be a user profile and so on. At Streak, the smallest unit of work is a box. A box can have email threads, notes, comments, and a lot of other custom fields. Boxes can be put into pipelines, which are shared across teams, and teams belong to an organization. The way you design your entity groups has consequences for throughput and consistency. In our case, for updating boxes, we have our entity groups set at the organization level, which means we have a lower number of entity groups. Entity groups are also units of consistency: a query against an entity group will always give you strongly consistent results, and transactions within an entity group always give you the ACID properties. All entities within an entity group are also stored physically close together, so with minimal I/O, reads are really fast. In our example with boxes, when the entity group is set at the organization level, one query against an entity group gets you a high volume of entities. But there are limits on how many writes you can do per entity group per second: currently in Datastore you can only do one write per second per entity group, and I'll talk about the consequences of that in a second.

The other things we found useful working with Datastore: we use Objectify, which is a Java data access API that makes using Datastore really easy. The example at the bottom of the slide, where you have annotations to work with entities, indexes, and so on, has been really useful. We use JSON with Jackson annotations for serialization and deserialization.
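As a rough illustration of how those annotations express an entity group, here's a minimal Objectify sketch; the class and field names are made up for this example rather than Streak's actual schema.

```java
import com.googlecode.objectify.Key;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;
import com.googlecode.objectify.annotation.Index;
import com.googlecode.objectify.annotation.Parent;

// Hypothetical root kind: the organization that anchors the entity group.
@Entity
public class Organization {
    @Id Long id;
    String name;
}

// A "box" placed in the organization's entity group via @Parent. Queries and
// transactions scoped to one Organization key are strongly consistent, but
// the whole group shares the one-write-per-second budget.
@Entity
public class Box {
    @Id Long id;
    @Parent Key<Organization> organization; // entity-group parent; permanent once saved
    @Index String stage;                    // indexed property, usable in query filters
    String pointOfContact;                  // unindexed property
}
```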

Then some of the challenges. As I was saying, the write throughput for an entity group in Datastore is limited to one write per second, so ideally you want really small entity groups so that you're not writing more than once per second. In our case, organizations and pipelines can have hundreds or thousands of boxes, so if you're trying to update multiple boxes in a pipeline, that can be tens of thousands of writes against one entity group, which can cause a lot of contention. Something else to keep in mind: entity group relationships are permanent; you cannot change them unless you delete the entities and recreate them in a new entity group. So contention is a challenge, and as we grow and get larger and larger customers, it has been a big one. The way our schema has evolved, boxes have a lot of child objects, and they all belong to the same entity group, but transactions are limited to 500 entities per transaction, so cascading writes also become a performance bottleneck. Another limit to keep in mind: a transaction can span at most twenty-five entity groups. So again, you want to figure out your key design so that you find the right balance between consistency and throughput.

And then, client-side views. As you saw in the first graphic, we present a spreadsheet layout to our users, and we offer complex filtering, grouping, and sorting. Datastore doesn't provide that out of the box, so we have to shift that responsibility into application code, which of course increases complexity, debugging complexity, and so on. And not having joins: as we grow larger, we really miss having those features out of the box, the things you would get from a relational database.
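To make the entity-group and transaction limits above concrete, here's a hedged sketch of a bulk update wrapped in an Objectify transaction, reusing the illustrative Box entity from the earlier sketch; the helper and its behavior are assumptions for this example, not Streak's actual code.

```java
import com.googlecode.objectify.VoidWork;
import static com.googlecode.objectify.ObjectifyService.ofy;

public class BulkStageUpdate {
    // Moves a batch of boxes to a new stage inside one Datastore transaction.
    // With boxes grouped under an organization-level parent, every write here
    // counts against that single entity group's ~1 write/sec budget, and a
    // transaction can touch at most 25 entity groups and 500 entities, so
    // large cascading updates have to be chunked and retried.
    public static void updateStage(final Iterable<Box> boxes, final String newStage) {
        ofy().transact(new VoidWork() {
            @Override
            public void vrun() {
                for (Box box : boxes) {
                    box.stage = newStage;
                }
                ofy().save().entities(boxes).now();
            }
        });
    }
}
```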

Cloud Firestore. Firestore is the latest version of Datastore, and there are two modes: native (real-time) mode and Datastore mode. Firestore in Datastore mode is backwards compatible with Cloud Datastore, but it doesn't have some of the newer features native Firestore has, like real-time updates and the rich mobile and web client libraries. Firestore in Datastore mode is supposed to be a drop-in replacement for Datastore, and the migration is supposed to be automatic, without any downtime, so we're really excited about that. Basically, Firestore has a Spanner backend with the Datastore features on the front end, so you get a strongly consistent layer underneath, and this eliminates some of our biggest pain points with Datastore: we get strongly consistent results throughout, and the entity group restrictions I talked about, the 25 entity groups per transaction and the write throughput limit of one per second per entity group, go away. We're hoping to upgrade soon.

Alright. So we talked about how we felt the need for a fully relational model for our application data, and there's a new feature we worked on that we call the metadata problem. I'm sure all of you have had the problem where somebody forgot to hit reply-all on an email, then you got a forward of that email, and now you have a pile of redundant emails in reverse chronological order; it's just not very elegant. There can also be problems where a permission-sensitive email that you were not supposed to receive ends up in your inbox. Streak's solution is a unified view of emails across a team. In this example, the top three emails were not addressed to me, but as you can see they're shared from my co-workers' inboxes, so you get an elegant view of the thread, controlled by a permission model. This looks really good in the UI, but on the back end it's a complex graph traversal problem. We want to answer questions like: which co-workers of mine have interacted with this particular prospect over email? Or, for this email, what are the other emails on this thread that are not in my inbox but are in my co-workers' inboxes? The solution we came up with is a graph, where a node is either a person, an organization or a domain, or an email message. Email messages are connected if they're on the same thread, so if I emailed you and you forwarded the email to somebody else, we can trace those emails and give you a unified view. Two emails are also connected if they are the same email but live in different people's inboxes; they're connected by something called an RFC message ID in Gmail: same message, different inboxes. This lets us answer those questions and create a unified view that can be shared across a team. To talk about how we approached solving this, and the relational model it needs, back to Fred.

Great. Real quick, before I go on to our solution, I want to build on what Arpita was saying. This is a pretty challenging graph database problem: you have to really understand the course of an email thread, or of connected email threads, across different people's inboxes in your organization; you might have to hit hundreds of different inboxes.
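To make the graph model concrete, here's a minimal sketch of the node and edge types just described; the type and field names are invented for this example and are not Streak's actual data model.

```java
// Illustrative node and edge types for the email-metadata graph.
public final class EmailGraph {

    enum NodeType { PERSON, DOMAIN, ORGANIZATION, EMAIL_MESSAGE }

    // A node is identified by its type plus a key such as an email address,
    // a domain name, or a per-inbox message identifier.
    record Node(NodeType type, String key) {}

    enum EdgeType {
        SAME_THREAD,  // two messages on the same Gmail thread
        SAME_RFC_ID,  // the same message in different inboxes, matched by the
                      // RFC 822 Message-ID header
        PARTICIPANT   // a person appears on a message's from/to/cc/bcc lines
    }

    record Edge(Node from, Node to, EdgeType type) {}

    private EmailGraph() {}
}
```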

I'm sure you've all been on these horrible reply-all threads that have not just hundreds of people from the company on them, but thousands of emails once you count all the connected email chains. So there's this graph database problem, and it's also challenging for a couple of technical reasons. One is that you have to be able to do all of these edge traversals quickly; there may be a lot of emails, which are the nodes of the graph. And, as Arpita mentioned, permissions are a huge deal. Email can be really sensitive: you could have password-reset emails, or a manager giving targeted feedback to a report that's not meant to be generally consumable. There are all these ways it can go horribly wrong if you don't have a strong permissions model. What this means is that we have to traverse this graph, and there are big challenges if we try to precompute a lot of it. Normally the distributed-systems way out is to denormalize a bunch of stuff in advance, but in this case we really needed to compute it at read time, to make sure we get permissions right even if somebody just changed some permissions. And this is all happening in the context of viewing emails in Gmail, where the whole premise is that it's really fast: people expect their emails to just load quickly, even when they're searching their whole inbox, and we're expanding that to letting you search not just your inbox but potentially everybody's inbox in your company. So it's a challenging technical problem. Looking at the back end, across our entire user base we have about 30 terabytes of just email metadata. We're not looking at the contents of email, and we don't really want to; we're just looking at who the email is from, who it's to, the sort of information we need to compute these edges. We evaluated a bunch of database products. There are a lot of graph-specific databases on the market, but all else being equal, we wanted something that was managed by Google. We have a very small operational team, roughly three infrastructure engineers depending on how you slice different people's time, and being able to offload the availability and sharding work to Google SRE really seemed like a good trade-off to us.

So we looked at a bunch of options, and this ended up being our first use case for Cloud Spanner. Cloud Spanner is about a year old at this point, maybe a little more. The key things about the product in general: it is a globally consistent relational database. What does that mean? It supports the SQL language that you know, and possibly love or hate depending on what you've done with it, and it has what I affectionately call the magic clock thing. The magic clock is basically this: for years and years there was always an introductory paragraph in every distributed systems paper that said something like, "we know clocks are unreliable, they go back in time and have jumps, and the network is unreliable and may have horrible latency." Everybody took it as a premise that you couldn't rely on clocks, that they were trying to lead you astray. The Spanner white paper that came out a while back, and now the Cloud Spanner product, is built on the idea of: what if, instead of accepting that premise, we just made the clocks roughly reliable? There's a bunch of effort in installing atomic clocks in all the data centers, a bunch of effort in setting up the networking, and a bunch of effort in the statistics of putting error bars on timing, but the outcome is that what Cloud Spanner says on the tin is actually true, and we're using it in anger: it is globally consistent.

Basically this means that, going back to what Arpita described with Datastore, you don't have to pre-select particular entity groups or declare in advance which things you'd like to do transactions on. When you go to actually update entities, Spanner looks at what you've queried, and you only get transactional conflicts if you're actually modifying things that were also read by other transactions; it's just the set of real transaction conflicts.

So this is great: globally consistent SQL with really good transactions. Some of you thinking this through might say: but wait, email itself isn't consistent, so why do you care about consistency? I might send you an email, and all the time we see emails taking ten or thirty minutes to get delivered. Why care about the database being consistent? The answer goes back to operational ease. It means that in our syncing logic, when we sync the data and build the indexes that power these graph queries, we don't have to worry that something is partially committed. We can say that all of this should be committed in one transaction, and it's either all committed or not committed at all. This means our indexing pipeline, which processes a lot of email, was basically written by one engineer working about half-time on it for a month or so, whereas for a distributed system of that scale, when we were planning out the work, we figured we'd probably need at least two people on it for a quarter.
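Here's a minimal sketch of the kind of all-or-nothing write that sync logic relies on, using the Cloud Spanner Java client; the project, instance, table, and column names are placeholders for this example, not Streak's actual schema.

```java
import com.google.cloud.spanner.DatabaseClient;
import com.google.cloud.spanner.DatabaseId;
import com.google.cloud.spanner.Mutation;
import com.google.cloud.spanner.Spanner;
import com.google.cloud.spanner.SpannerOptions;

import java.util.Arrays;

public class IndexingCommitSketch {
    public static void main(String[] args) {
        Spanner spanner = SpannerOptions.newBuilder().build().getService();
        DatabaseClient db = spanner.getDatabaseClient(
            DatabaseId.of("my-project", "my-instance", "email-metadata")); // placeholder IDs

        // A message row and one of its per-recipient rows, committed together:
        // either both land or neither does, so readers never see a
        // half-indexed message.
        Mutation message = Mutation.newInsertOrUpdateBuilder("Messages")
            .set("MessageId").to("msg-123")
            .set("ThreadId").to("thread-9")
            .set("Subject").to("Quarterly planning")
            .build();
        Mutation mailbox = Mutation.newInsertOrUpdateBuilder("MessageMailboxes")
            .set("MessageId").to("msg-123")
            .set("MailboxEmail").to("alice@example.com")
            .set("IsCc").to(true)
            .build();

        db.write(Arrays.asList(message, mailbox)); // atomic commit of both rows
        spanner.close();
    }
}
```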

Just by relying on Spanner's transactionality to deal with a lot of the edge cases, we were able to get it out and into customers' hands much more quickly.

Building on that operational ease, there are a lot of other really nice features in that area around Spanner. In a previous life I managed a couple of on-premise SQL databases, and all of those operational things that fill DBAs with fear, like adding or removing columns and adding or removing indexes, are pretty easy to do here. There's also a pretty good task planner that makes sure that even if you add and remove a bunch of columns, it doesn't interfere with your actual transactional workload. Those are all nice things to have, even if your underlying data isn't necessarily super dependent on strong consistency.

I've been singing its praises generally, so let's talk about how it actually works in practice. This is, in the Cloud Console, the schema for one of our tables in production. The table is what we call message mailboxes. If you remember the graph nodes, there are some things that depend on the message itself, like which Gmail thread it's on and which other messages have the same Message-ID header, to connect it between inboxes; but there's also data that depends on who the message is addressed to, who it's from, or who's on the CC list. Message mailboxes has one row per person on a message, so if I send a message to you, the version of the message in my inbox would have two message mailboxes rows, one for me and one for you.

Key things here: you can see at the top that this table is interleaved in messages. Interleaved tables are a Cloud Spanner feature that gives you really great performance for data that should be co-located. Most of the time when we're querying who's on a particular message, we also care about the message they're on, so we've interleaved message mailboxes in messages. That means if I want to do an index filter on all the messages a particular email address was part of, say, to show all of your communication with a particular sales lead across the organization in the Streak product, I can query on mailbox email and then do a very quick, very performant join with the messages table to get things like the subject of that email, properties that in a more traditional distributed system you might have to denormalize and waste a lot of space on to make performant. So we use interleaved tables quite a bit.
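Here's a hedged sketch of the kind of lookup that interleaving makes cheap, again with illustrative table and column names rather than the exact production schema.

```java
import com.google.cloud.spanner.DatabaseClient;
import com.google.cloud.spanner.ResultSet;
import com.google.cloud.spanner.Statement;

public class LeadHistoryQuery {
    // All messages a given address was on, joined back to the parent Messages
    // table for the subject. Because MessageMailboxes is interleaved in
    // Messages, the joined rows are stored together and the join stays cheap.
    // The caller is responsible for closing the returned ResultSet.
    static ResultSet messagesFor(DatabaseClient db, String email) {
        Statement stmt = Statement.newBuilder(
                "SELECT m.MessageId, m.Subject "
                    + "FROM MessageMailboxes AS mm "
                    + "JOIN Messages AS m ON mm.MessageId = m.MessageId "
                    + "WHERE mm.MailboxEmail = @email")
            .bind("email").to(email)
            .build();
        return db.singleUse().executeQuery(stmt);
    }
}
```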

One other interesting thing: you'll see these slightly weird is_bcc and is_cc boolean fields at the bottom. They answer: is this particular email address on the BCC line, is it on the CC line, how is it connected to this message? The interesting part is that those columns actually take two values, which you might expect for a boolean, but the values are TRUE or NULL. Why are we doing this? Spanner has the ability to define what are called null-filtered indexes. If you're coming from Postgres (and I'm sure Microsoft SQL Server has something similar), you might be familiar with the concept of partial indexes. It means we can create an index for just "is this person on the From line" or "are they on the To line," and it doesn't use any space for the message mailboxes rows that don't have the corresponding boolean set to TRUE rather than NULL; it filters out all the NULL entries.

So, lessons learned with Cloud Spanner: setting up your schema is super important, particularly if you're putting a lot of data into it. It's pretty easy to add and remove secondary indexes, but you are stuck with your primary key. If you're changing your primary key, you have to rewrite the data; it's not magic like that. So it's important to play around with the schema up front to figure it out.

What challenges have we run into so far? I've been singing its praises, but no plan survives contact with the enemy, or with the database query planner. One challenge we ran into is that Spanner is actually too smart for its own good for our use case; your use case may vary. Spanner has a lot of smarts in the query planner to balance running many small transactional queries, which is probably the ninety-percent use case. If you run a really big query, it assumes it must be an analytical query or maybe a back-end bulk task, and it deprioritizes that query so that, even if it takes a long time to run, it doesn't get in the way of your transactional queries. But going back to our use case: as we mentioned, you've all been on these horrible huge reply-all threads, and a lot of times, fortunately or unfortunately, those threads are the most important ones, because they're the ones where the collaboration story really gets tricky and you need to be able to share some stuff but not other stuff. We really need those to be answerable in a short enough amount of time that it doesn't degrade the Gmail experience. What we were running into was that when we did the queries to fan out on those really big threads, it wasn't overwhelming anything else, but it was taking a long time, and there wasn't a great way to tell the query planner, "this query is really important, go ahead and do it." We tried some experiments with fanning out a bunch of queries, and some experiments with doing a little bit of caching in something like Redis using Cloud Memorystore, but we didn't find anything that delivered the user experience we wanted. That led us to the next database we're going to chat about: Cloud Bigtable.

Cloud Bigtable is based on Bigtable, which at this point is a very well-established Google technology. It has a lot less functionality than Cloud Spanner or Cloud Datastore: it's really just a key-value store, or a sorted key-value store, let's say. Basically that means there are two kinds of queries you can do against it. You can query for a particular key and get its value, or you can do what are called lexicographic range queries, which basically say: give me all values for keys between A and B, or between AA and AB. So it's a pretty limited query interface, but the trade-off is that it's really, really simple and really, really fast. It's another managed database; you can add and remove nodes to your heart's desire, and you pay for more nodes and get more performance. When it's nighttime you can decrease the number of nodes and not pay for what you're not using. Each node you add gives you ten thousand reads or writes a second, which is a lot, and the queries just look like: put some data in for this key, or read some data for this key or set of keys.

There are currently two modes. The mode we use for this, just because for our use case it's not great if the feature is unavailable but it's not business-ending (people can still get at their Streak box data, which is stored safely in Cloud Datastore), is the single-zone mode, which has the concept of a transaction within a particular row. The other mode does multi-zone replication: you forgo transactions, but even if one zone goes down, which is infrequent but happens, you can still access the data in another zone, and it asynchronously replicates after the zone comes back up.

So, Cloud Bigtable: not a great tool for every job, but the best tool for some jobs, and we're actually using it in production for querying this email metadata index. Let me show you what that looks like. Here we've got the output of cbt, the Cloud Bigtable command-line tool, for our messages table. You can see there's this concept of a column family, which is a set of data that's stored together, and we have one column family for each of the indexes we need to traverse the graph. For instance, I was showing you the message mailboxes table; we've got one column family that's the index for looking up message mailboxes by domain or by email, and messages are also indexed by the RFC message ID and the thread ID, which lets us do that graph traversal. The cool thing here is that the Bigtable API is completely asynchronous, so when we're doing this graph query we can have a queue that, as soon as information comes back, sends out the next set of fetches. It's all pipelined and can be very much sub-second: each lookup is about six milliseconds, so you can do a lot of lookups before somebody notices the delay.
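Here's a minimal sketch of those two query shapes with the Cloud Bigtable Java client; the IDs and row-key layout are assumptions for this example. The client also exposes async variants of these reads, which is what makes the pipelined fan-out described above possible.

```java
import com.google.api.gax.rpc.ServerStream;
import com.google.cloud.bigtable.data.v2.BigtableDataClient;
import com.google.cloud.bigtable.data.v2.models.Query;
import com.google.cloud.bigtable.data.v2.models.Row;

public class MetadataLookupSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder project and instance IDs.
        try (BigtableDataClient client =
                 BigtableDataClient.create("my-project", "email-metadata")) {

            // Point read: one row by its exact key.
            Row byRfcId = client.readRow("messages", "rfcid#example-message-id");

            // Lexicographic range scan: every row whose key starts with a
            // prefix, e.g. all index rows for one thread id.
            ServerStream<Row> byThread =
                client.readRows(Query.create("messages").prefix("thread#12345#"));
            for (Row row : byThread) {
                System.out.println(row.getKey().toStringUtf8());
            }
        }
    }
}
```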
Other things to note: this is very much build-it-yourself. The model in Bigtable is just a string of bytes mapping to a string of bytes, so we had to figure out our own encoding that sorts lexicographically the way we want. There are a few libraries out there; there's one called Orderly, and there's one that's partially factored out of the HBase library. So there's some tooling, but you still very much need to figure out your own schema and your own encoding. This was a bit more developer-intensive, so I definitely don't recommend reaching for it from the get-go, but it is very, very fast and it really does what it says on the tin. We're using Bigtable for those lookups in production: six milliseconds per lookup, you can do a whole bunch of reads all at once, and it's providing what we need.
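As a hand-rolled illustration of the kind of encoding involved (the libraries mentioned above handle richer cases such as signed integers and descending order), here's a sketch that joins key components so that prefix scans line up with logical groupings; it is not Streak's actual encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public final class RowKeys {
    // Joins key components with a 0x00 separator. UTF-8 never produces a zero
    // byte for ordinary (non-NUL) characters, so component boundaries are
    // unambiguous and a prefix of the component list is a byte prefix of the
    // key, which is what lexicographic range scans need.
    static byte[] key(String... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < components.length; i++) {
            if (i > 0) {
                out.write(0x00);
            }
            byte[] bytes = components[i].getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // e.g. an index row keyed by thread id, then message id:
        byte[] rowKey = key("thread", "12345", "msg-abc");
        System.out.println(rowKey.length);
    }
}
```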

So then, at the end of the day, I just wanted to summarize what we already talked about in a convenient flow chart. The first question I'd ask if I were reaching for a new database is: how developed is your project, schema, product, however you want to think about it? If you're still finding product-market fit or experimenting with the project, then I think either Cloud Datastore or Cloud Firestore is probably where you want to go. The key things there: very flexible schemas, you just set up the documents and then you can index them a bunch of different ways after the fact, and it's very easy to work with. If you're using Cloud Firestore for a new project, you can actually use it in native (real-time) mode, get most of the same benefits we're talking about, and also be able to send data directly to your mobile or web clients. So if you're still figuring things out, that's a great choice. Also, Cloud Firestore and Cloud Datastore have a per-operation billing model, so if you're not using them you're only paying for storage, which is pretty cheap if you're just getting started, whereas Cloud Bigtable and Cloud Spanner have a per-node pricing model, so you're paying hourly for each node you have set up. It's really cheap to get started with Cloud Firestore or Cloud Datastore.

On the difference between those two, the question I sort of jokingly put is: is it 2019 or later? Cloud Firestore in Datastore mode is just Cloud Datastore, but better, and they've announced that they're going to move everybody to Cloud Firestore. So if you're starting a new project, just start with Cloud Firestore; I have trouble envisioning a case in which you'd actively want to go with Datastore for a new project.

If you're more certain of the needs for your project, either because it's a new technology for an established company or product, or because you're really certain of your performance needs, have tested them, and have a really good idea, then that's when you should probably reach for Cloud Bigtable or Cloud Spanner. Cloud Spanner is a lot nicer to work with: it has data types, it has SQL, it has these globally consistent transactions. So generally, unless you know you're going to need to really optimize those key-value lookups and build something like this horrible reply-all-thread case, go with Cloud Spanner. We actually built this metadata indexing system entirely on Cloud Spanner to start with, and I'm super happy that we did, even though we ended up moving some of the workload to Cloud Bigtable, because it informed a lot of our decisions and gave us more information about the use case. And then, if you really do need that one specific tool, Cloud Bigtable is probably what you should reach for.



Comments:

I'm only 6 mins in at this point. She's talking about Datastore. Do they eventually switch over to talking in terms of the current proposition, Firestore? (Or maybe their team switched away from using it before the current Firestore product, and they don't bother to cover the stuff they haven't used?)

(Update: She does switch to covering Firestore @ 11:00.)

UPDATE, from after watching: essentially the latter. The evaluations of what is suitable for what are based on the old Datastore's capabilities, not on its successor product Firestore's capabilities.
