PP-Structure v3

Key Features:

Live Video OCR with Tracking: Real-time text recognition and object tracking in video.
Multimodal Input (Image/Video/Audio): Speech-to-text integration for contextual analysis.
GDPR Compliance: Built-in redaction for sensitive data (e.g., IDs, bank numbers).
Cloud-Scale Processing: Dynamic token handling for large documents.
Tool Integration: Search as a tool and code execution for custom workflows.
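
To make the redaction feature listed above concrete, here is a minimal sketch of pattern-based masking applied to recognized text. It is not the product's implementation: the `redact` function, the `PATTERNS` table, and both regular expressions are hypothetical and chosen only for illustration. Real compliance workflows need far more than regex matching.

```python
import re

# Hypothetical patterns for illustration only; not taken from PP-Structure.
# Order matters: IBANs are masked first so the card pattern cannot match
# the digit groups inside an IBAN.
PATTERNS = {
    "IBAN": re.compile(
        r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b"
    ),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}


def redact(text: str) -> str:
    """Replace each match of a known pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    sample = "Pay DE89 3704 0044 0532 0130 00 or card 4111 1111 1111 1111."
    print(redact(sample))
```

In a pipeline like the one described here, a step of this kind would presumably run on the OCR or speech-to-text output before it is stored or returned.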

Model Deployment Status:

General Availability: Yes (Enterprise-only)
Supported Input Data Types: Image, Video, Audio, Live Feeds
Supported Output Data Types: Text + Semantic Annotations
Maximum Output Tokens: 16k (context-aware)
Knowledge Cutoff: June 2024

Best For:

Surveillance video analytics
Automated compliance reporting

Availability:

Gemini API
AWS Bedrock
