Video processing getting stuck randomly
Outage
Opened: Dec 5, 2024, 5:17 AM UTC
Duration: 4hrs 35min 58sec
- Opened Dec 5, 2024, 5:17 AM UTC
We are currently investigating this issue.
- Resolved Dec 5, 2024, 9:52 AM UTC
Our AWS Lambda function was crashing because it hit its memory limit. We have increased the memory limit and manually processed the affected videos.
- Resolved Dec 13, 2024, 2:54 PM UTC
What happened:
- A 2-hour recording was uploaded, with a size of 1.6 GB.
- Our AWS Lambda function was configured with 3 GB of RAM and 10 GB of ephemeral storage.
- Because of the size of the recording and the processing it required, the Lambda function ran out of memory and crashed.
- The crash caused transcript generation to fail for this particular recording.

Why other recordings were also not processed:
- Contrary to our understanding, AWS Lambda can reuse the same execution environment for subsequent invocations.
- To process a WEBM file stored on S3, we download it into the /tmp directory of the Lambda environment. Temporary files generated by subsequent operations on the file are also stored in the same /tmp directory. We manually clear the /tmp directory after every execution, but because of the crash, /tmp did not get cleared. As a result, a few of the subsequent invocations also crashed once storage crossed the 10 GB limit.

Action items related to the above issue:
- We manually processed the files for which transcoding had failed - https://github.com/bigbinary/neeto-record-web/blob/main/docs/ops/transcoding-files-manually.md
- We bumped the RAM up to 7 GB. Ephemeral storage was already set to its maximum of 10 GB.

Other action items identified:
- https://github.com/bigbinary/neeto-record-web/issues/2658 - Transcript generation was taking a long time for longer videos. We identified a fix and made transcript generation faster. We also bumped the AWS Lambda timeout from 5 to 15 minutes, ensuring that larger files won't time out.
- https://github.com/bigbinary/neeto-record-web/issues/2657 - We created a cron job that runs every hour to process unprocessed files.
- https://github.com/bigbinary/neeto-record-web/issues/2655 - We updated the code in AWS Lambda to clear the /tmp directory at the start of the execution as well.
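The /tmp-clearing fix described above can be sketched roughly as follows. This is a minimal illustration, not the actual neeto-record-web code; the `handler` and `process_recording` names are hypothetical placeholders.

```python
import os
import shutil

# Lambda mounts its ephemeral storage at /tmp; it survives across
# invocations when the execution environment is reused.
TMP_DIR = "/tmp"

def clear_tmp(tmp_dir=TMP_DIR):
    """Delete everything under tmp_dir so a reused execution
    environment starts with empty ephemeral storage."""
    for name in os.listdir(tmp_dir):
        path = os.path.join(tmp_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path, ignore_errors=True)
        else:
            os.remove(path)

def process_recording(event):
    # Placeholder: download the WEBM from S3 into /tmp, transcode,
    # and generate the transcript.
    pass

def handler(event, context):
    # Clear /tmp at the START of the invocation as well, so leftovers
    # from a crashed previous invocation cannot push storage past the
    # 10 GB ephemeral-storage limit.
    clear_tmp()
    try:
        process_recording(event)
    finally:
        clear_tmp()  # the pre-existing post-run cleanup
```

Clearing at both the start and the end means a crash between the two cleanups can no longer poison later invocations that land in the same environment.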
- https://github.com/bigbinary/neeto-record-web/issues/2659 - We set up an alarm to report crashes in AWS Lambda.
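An alarm of the kind described in issue 2659 can be built on CloudWatch's per-function `Errors` metric. The sketch below only assembles the parameters that would be passed to boto3's `put_metric_alarm`; the alarm name, function name, and SNS topic ARN are placeholders, not values from the actual setup.

```python
def lambda_error_alarm_params(function_name, sns_topic_arn):
    """Build CloudWatch put_metric_alarm parameters that fire whenever
    the given Lambda function reports any Errors in a 5-minute window."""
    return {
        "AlarmName": f"{function_name}-errors",        # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,                                  # 5-minute evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",   # any error at all alarms
        "TreatMissingData": "notBreaching",             # no invocations != failure
        "AlarmActions": [sns_topic_arn],
    }

# Usage (requires boto3 and AWS credentials):
#   cloudwatch = boto3.client("cloudwatch")
#   cloudwatch.put_metric_alarm(
#       **lambda_error_alarm_params("transcode-recording", "arn:aws:sns:...")
#   )
```

`TreatMissingData: notBreaching` keeps the alarm quiet during idle periods when the function is not invoked at all.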