This ties to my site https://MtgDiscovery.com - A shock, right?
Sometimes I have to update sets and re-ingest data. This happens a lot when the feed doesn't align with how I collect the cards. Best example is when the size of the set represented on the "booster" cards is X and the ingested data has the set size as X+Y; where Y is the "special" cards with borderless, or other fancy things.
My initial ingestion will put ALL of those cards into the main set. Well... I don't care about the "Extended" set for most of the sets. Some I go overboard and collect them; but it's a distinct thing to collect for me.
Since I have to sometimes update sets, there have been instances where I added 'extended' cards while they were part of the core set.
Once the new set is created; that card is "corrupt". It's associated with the wrong set, and will show in some counts; but is inaccessible to update/remove.
I needed a process to find and fix these cards. Since there will be a lot of reads, I figured it's a great case for the IAsyncEnumerable
that exists. I don't need everything in memory... so let's grab just the thing, process, and repeat.
Except... No. it hurt. A lot. So bad it actually kept crashing the process.
The difference is nearly 95% of my available DTUs.
The only difference is that it reads in ALL of the cards of a set before processing any of them and it really doesn't even process them. It just checks if that card needs to be updated.
The flow is
1) Get all Sets
2) For each Set
2.1) Get all cards in set
2.2) Limit set cards to just collected cards
2.3) For collected card, update collected set id
Pretty straight forward. the "limit to collected cards" had the CollectedCards set to use the IAsyncEnumerable.
It'd call the DB A LOT.
I switched the retrieval of the collected cards to pull in all records... and performance is SO much better.
My lesson from this is that on the Azure SQL Basic Tier; don't use IAsyncEnumerable unless you've perf tested it. My perf was SHIT when I did it. Mostly because I kept spiking my DTU usage to 100%, solely from the IAsyncEnumerable.
Anyway - a fun learning and something I'll keep in mind for the future when I want to pull data in like this.