Lately, I've been working with different video generation solutions for a client's marketing automation tool. Implementation usually involves asynchronous processing: video generation takes time, so you submit a task and then check its status until it finishes. Key points from my experience: check the maximum video length, the supported resolutions, the price per second of output, and whether the platform handles physical simulation well. Some APIs struggle with realistic movement, while others keep objects in a scene better coordinated. I would advise you to start with the Sora 2 API; I like its results.
https://yesai.su/en/docs/sora2
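
For anyone wiring this up, the submit-then-check-status flow I mentioned looks roughly like the sketch below. It is not tied to any particular provider: `submit` and `get_status` are hypothetical callables you would implement against your API's real endpoints, and the `"state"` values (`"succeeded"`, `"failed"`) and the `"error"` field are my assumptions, so map them to whatever your provider actually returns.

```python
import time
from typing import Callable


def wait_for_video(
    submit: Callable[[], str],
    get_status: Callable[[str], dict],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Submit a generation task, then poll its status until it finishes.

    submit      -- hypothetical: starts a task and returns its id
    get_status  -- hypothetical: returns a dict with a "state" key
    """
    task_id = submit()
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(task_id)
        state = status.get("state")  # assumed field name
        if state == "succeeded":
            return status
        if state == "failed":
            raise RuntimeError(f"task {task_id} failed: {status.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout}s")
        # Any other state (queued, processing, ...) means: wait and retry.
        sleep(poll_interval)


if __name__ == "__main__":
    # Fake backend standing in for a real API, just to show the flow.
    fake_states = iter(
        [{"state": "queued"}, {"state": "processing"}, {"state": "succeeded"}]
    )
    result = wait_for_video(
        submit=lambda: "task-1",
        get_status=lambda task_id: next(fake_states),
        poll_interval=0.0,
    )
    print(result["state"])
```

Passing the two calls in as functions keeps the polling logic separate from the HTTP details, so you can swap providers while comparing them without rewriting the loop. Keep the timeout generous, since longer or higher-resolution clips usually take longer to render.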