Summary
When extracting metadata for some repositories with SOCA, the extraction fails because SOMEF skips repository archives larger than a hardcoded 200 MB download limit.
This affects the SOCA command:
soca extract -i <repos-file> -o <output-dir>
For repositories whose GitHub archive is larger than 200 MB, SOCA/SOMEF logs messages like:
WARNING - Repository archive skipped due to size limit: 200 MB or not content lenght.
ERROR - Error processing the target repository
As a result, the repository metadata is not extracted. In SOMEF, the download limit is hardcoded as a constant:
SIZE_DOWNLOAD_LIMIT_MB = 200
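To make the behavior concrete, here is a minimal sketch of the kind of check that would produce the warning above. It is not SOMEF's actual code: the helper name `should_skip_archive` is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption. Only the constant name and value come from SOMEF. The warning text suggests that archives with no content length are skipped too, which the sketch mirrors.

```python
# Value quoted from SOMEF's constants.
SIZE_DOWNLOAD_LIMIT_MB = 200


def should_skip_archive(content_length_header, limit_mb=SIZE_DOWNLOAD_LIMIT_MB):
    """Hypothetical helper: return True when an archive would be skipped.

    content_length_header is the raw Content-Length value (a string),
    or None when the server did not send one.
    """
    # No content length: the size cannot be verified, so skip.
    if content_length_header is None:
        return True
    try:
        size_bytes = int(content_length_header)
    except ValueError:
        # Unparseable header is treated like a missing one.
        return True
    # Assumption: 1 MB = 1024 * 1024 bytes.
    return size_bytes > limit_mb * 1024 * 1024
```

With a fixed limit like this, any repository whose archive exceeds 200 MB is skipped regardless of how SOCA is invoked.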