🐛 [Stream Firestore to BigQuery] Limiting or Controlling the number of resources used for extension. #2164

Open
siarheidudko opened this issue Aug 22, 2024 · 0 comments
Labels: type: feature request (New feature or request)

[READ] Step 1: Are you in the right place?

Issues filed here should be about a feature request for a specific extension in this repository. To file a feature request that affects multiple extensions or the Firebase Extensions platform, please reach out to
Firebase support directly.

[REQUIRED] Step 2: Extension name

This feature request is for extension: firebase/firestore-bigquery-export

What feature would you like to see?

I would like you to consider two improvements to the extension:

  1. Installing a single extension instance for several collections.
  2. Limiting or controlling the amount of resources used by the extension.

Why?

Today, due to an error in firebase-tools, all my extensions were erased and I had to reinstall them several times.
I have 8 collections that I export to BigQuery, so I need to install firebase/firestore-bigquery-export 8 times.
Each instance creates:

  • 5 Firebase functions
  • 4 queues (these turned out not to be removed when the extension is deleted, so you have to delete them manually)

Thus, after installing 8 times, I got 40 Firebase functions and 32 queues.

Before that, I used several earlier versions of this extension.
The first version created only 1 function per instance, so I would have had 8 functions instead of 40, and no queues at all. As a software architect, though, I understand that this design scales poorly.
After that, you implemented a more fault-tolerant and scalable architecture with 1 queue and 2 functions (a producer and a consumer) per instance. This is the version I was running before the error that forced me to reinstall everything, so I had 8 queues and 16 functions. That was acceptable, but it could still have been optimized: for example, a single queue and a single consumer could serve all extension instances.
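
To make the comparison concrete, here is a minimal Python sketch that tallies the footprint of each design for 8 collections. The per-instance counts are the ones quoted in this issue. The `shared_consumer_totals` case is only my reading of the "1 queue and one consumer" idea (one producer per collection plus one shared consumer), not an existing option of the extension, and all names in the sketch are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Resources created by ONE installed instance of the extension."""
    functions: int
    queues: int


# Per-instance counts as described in this issue.
PER_INSTALL = {
    "first version": Footprint(functions=1, queues=0),
    "producer/consumer version": Footprint(functions=2, queues=1),
    "current version": Footprint(functions=5, queues=4),
}


def per_collection_totals(footprint: Footprint, collections: int) -> tuple[int, int]:
    """Totals when the extension is installed once per collection."""
    return footprint.functions * collections, footprint.queues * collections


def shared_consumer_totals(collections: int) -> tuple[int, int]:
    """Hypothetical design: one producer per collection, one shared
    consumer and one shared queue for all collections."""
    return collections + 1, 1


COLLECTIONS = 8

for name, footprint in PER_INSTALL.items():
    functions, queues = per_collection_totals(footprint, COLLECTIONS)
    print(f"{name}: {functions} functions, {queues} queues")

functions, queues = shared_consumer_totals(COLLECTIONS)
print(f"shared consumer (hypothetical): {functions} functions, {queues} queues")
```

Under these assumptions, the shared design would need 9 functions and 1 queue, compared with 40 functions and 32 queues today.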

The functions have also not been optimized for resource consumption. They work fine with 128MB of memory, but the default is 256MB, and whenever the extension is reconfigured the setting is reset to 256MB.
In addition, only the producer is capped at 3000 instances; the other 4 functions have no limit on the number of instances. And this applies to each of the 8 extension instances.
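
As a rough illustration of why the default matters, the following Python sketch computes the theoretical ceiling on memory allocated to the producer alone, using the numbers above (3000 instances, 256MB versus 128MB, 8 installed instances). It is a worst-case bound under the assumption that every allowed instance runs at once, not a measurement of real usage, and it treats 1GB as 1024MB. The function name is hypothetical. No such bound can be computed for the other 4 functions, since they have no instance limit.

```python
def max_memory_gb(memory_mb: int, max_instances: int, installs: int = 1) -> float:
    """Upper bound on allocated memory, in GB, if every allowed
    instance of one function runs at the same time."""
    return memory_mb * max_instances * installs / 1024


PRODUCER_MAX_INSTANCES = 3000  # producer limit quoted in this issue
INSTALLS = 8                   # one extension instance per collection

for memory_mb in (256, 128):
    single = max_memory_gb(memory_mb, PRODUCER_MAX_INSTANCES)
    total = max_memory_gb(memory_mb, PRODUCER_MAX_INSTANCES, INSTALLS)
    print(f"{memory_mb}MB: up to {single:.0f}GB per install, {total:.0f}GB for {INSTALLS} installs")
```

With these inputs, the ceiling is 750GB per install at 256MB and 375GB at 128MB, so the reset to 256MB on every reconfiguration doubles the producer's worst-case allocation.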

Finally, reinstalling or reconfiguring the extension now triggers a re-synchronization, which consumes resources as well.

An extension is supposed to provide a simple piece of functionality, and this one really does nothing more complicated than streaming inserts into a dataset table. Yet it uses so many resources for this that, rather than spending time monitoring and limiting them, it is easier to stop using it and write my own solution (which would hardly take more than a couple of days).
Are you sure this is the right direction?

How would you use it?

Control over the resources used.

siarheidudko added the type: feature request label on Aug 22, 2024