I am currently using the Azure Machine Learning Python SDK, following the incremental embedding tutorial that uses Ada 002: https://github.com/Azure/azureml-examples/blob/main/sdk/python/generative-ai/rag/notebooks/faiss/url_to_faiss_incremental_embeddings_with_tabular_data.ipynb
I have an XLSX file with 533,000 rows. I am unable to crack and chunk it as an XLSX; I can process it as a TXT instead, but then I get a pipeline error when it tries to embed it.
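For context, the conversion to TXT is roughly the following (a minimal sketch with pandas; the file names and sheet name here are placeholders, not the real ones):

```python
# Rough sketch of converting the XLSX to a tab-separated TXT before running
# the crack-and-chunk pipeline (paths and sheet name are placeholders).
import pandas as pd

df = pd.read_excel("data.xlsx", sheet_name=0)  # ~533,000 rows
df.to_csv("data.txt", sep="\t", index=False)   # tab-separated text input
```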
Is there a cap on cracking and chunking? Is there a cap on the amount of data that can be embedded?
The pipeline works fine with smaller files.
Thanks
What I have tried so far: changing the file type and splitting the data into smaller files.
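The "smaller files" attempt looks roughly like this (a sketch only; the chunk size and output paths are placeholders I picked for illustration):

```python
# Split the large workbook into ~50,000-row CSV files so each input stays
# well under whatever limit the pipeline is hitting (values are placeholders).
import pandas as pd

CHUNK_ROWS = 50_000
df = pd.read_excel("data.xlsx", sheet_name=0)

for i in range(0, len(df), CHUNK_ROWS):
    part = df.iloc[i:i + CHUNK_ROWS]
    part.to_csv(f"data_part_{i // CHUNK_ROWS:03d}.csv", index=False)
```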