
I'm working on a transcriber API. It has a single endpoint, /transcribe, which takes an audio file to be transcribed. I want to add some Pydantic validators: one to make sure that the MIME type starts with audio/, and one to make sure that the audio is long enough (longer than 1 second). However, I have not found any way to validate data coming from a form.

I have tried making an Audio model that inherits from both BaseModel and UploadFile, but it is not working:

# Audio validator
from fastapi import UploadFile
from pydantic import BaseModel, validator


class Audio(UploadFile, BaseModel):
    content_type: str
    data: bytes

    @validator('data')
    def validate_data(cls, data):
        # data is raw bytes, so this compares its size in bytes, not its duration
        if len(data) <= 1:
            raise ValueError("Audio file uploaded is too short.")
        return data

    @validator('content_type')
    def validate_type(cls, content_type):
        if not content_type.startswith("audio/"):
            raise TypeError("The file uploaded is not an audio file.")
        return content_type

The endpoint and its signature are as follows:

@app.post("/transcribe")
async def transcribe(request: Request, lang: str = None, file: Audio = File(...), doctor_name: str = "Dr"):
    audio = await file.read()
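
For reference, the two checks themselves can be sketched as a plain function over the bytes returned by file.read(), independently of FastAPI and Pydantic. This is only a minimal sketch under a strong assumption: the upload is WAV-encoded, because the standard-library wave module cannot parse other formats (MP3, OGG and so on would need a third-party decoder). The names validate_audio, wav_duration_seconds and MIN_DURATION_SECONDS are hypothetical and not part of any library.

```python
import io
import wave
from typing import Optional

# Hypothetical threshold, taken from the "longer than 1 second" requirement.
MIN_DURATION_SECONDS = 1.0


def wav_duration_seconds(data: bytes) -> float:
    """Return the duration in seconds of WAV-encoded audio (assumption: WAV only)."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def validate_audio(content_type: Optional[str], data: bytes) -> None:
    """Raise ValueError if the upload is not audio or is not long enough."""
    # Check 1: the declared MIME type must start with "audio/".
    if not content_type or not content_type.startswith("audio/"):
        raise ValueError("The file uploaded is not an audio file.")

    # Check 2: the decoded duration must exceed the threshold.
    try:
        duration = wav_duration_seconds(data)
    except (wave.Error, EOFError) as exc:
        raise ValueError("The file uploaded could not be read as WAV audio.") from exc

    if duration <= MIN_DURATION_SECONDS:
        raise ValueError("Audio file uploaded is too short.")
```

Inside the endpoint, something like validate_audio(file.content_type, audio) could be called right after audio = await file.read(), with the ValueError translated into an HTTPException. That keeps the checks in one place, but it is not a Pydantic validator, which is what I was hoping for.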

Any advice?

Ralph Aouad