A computational art-analysis framework that formalises what connoisseurs have done for centuries — examining the physical evidence a painter leaves on a canvas — using self-supervised vision transformers, Bayesian inference, and cross-attention fusion.
Each grounded in art-historical practice and modern computer vision
Structure-tensor analysis and coherence clustering reveal the characteristic facture of individual painters.
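A minimal sketch of the structure-tensor coherence measure that facture analysis of this kind typically builds on. The function name, window size, and synthetic test image are illustrative, not the project's actual pipeline:

```python
import numpy as np
from scipy import ndimage

def coherence_map(gray, sigma=2.0):
    """Per-pixel orientation coherence in [0, 1] from the 2x2 structure tensor."""
    Ix = ndimage.sobel(gray, axis=1, mode="reflect")
    Iy = ndimage.sobel(gray, axis=0, mode="reflect")
    # Smooth the tensor components over a local window.
    Jxx = ndimage.gaussian_filter(Ix * Ix, sigma)
    Jyy = ndimage.gaussian_filter(Iy * Iy, sigma)
    Jxy = ndimage.gaussian_filter(Ix * Iy, sigma)
    # Coherence = (l1 - l2) / (l1 + l2) for the tensor eigenvalues, in closed form.
    gap = np.sqrt((Jxx - Jyy) ** 2 + 4.0 * Jxy ** 2)
    return gap / (Jxx + Jyy + 1e-12)

# Horizontal stripes are strongly oriented, so interior coherence is near 1.
stripes = np.sin(np.linspace(0, 20 * np.pi, 64))[:, None] * np.ones((64, 64))
print(round(float(coherence_map(stripes)[16:48, 16:48].mean()), 2))  # → 1.0
```

Clustering these coherence values across patches is one way to summarise a painter's characteristic stroke directionality.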
Multi-axis classification along period, school, and genre using dual-backbone embeddings.
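A toy sketch of the multi-axis idea, assuming concatenated per-image embeddings from the two backbones feed independent linear heads. Dimensions and class counts are made up for illustration and do not reflect the project's configuration:

```python
import torch
import torch.nn as nn

class MultiAxisHeads(nn.Module):
    """Independent period/school/genre classifiers over dual-backbone features."""
    def __init__(self, dino_dim=768, clip_dim=768,
                 n_period=12, n_school=20, n_genre=10):
        super().__init__()
        fused = dino_dim + clip_dim   # plain concatenation as a stand-in for fusion
        self.period = nn.Linear(fused, n_period)
        self.school = nn.Linear(fused, n_school)
        self.genre = nn.Linear(fused, n_genre)

    def forward(self, dino_emb, clip_emb):
        z = torch.cat([dino_emb, clip_emb], dim=-1)
        return self.period(z), self.school(z), self.genre(z)

heads = MultiAxisHeads()
p, s, g = heads(torch.randn(4, 768), torch.randn(4, 768))
print(p.shape, s.shape, g.shape)  # one logit tensor per axis
```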
Cosine-similarity ranking against a reference gallery with confidence intervals and temporal cross-validation.
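The ranking step itself reduces to normalised dot products; here is a minimal sketch with a random stand-in gallery (the confidence-interval and temporal cross-validation machinery described above is omitted):

```python
import numpy as np

def rank_gallery(query, gallery):
    """Indices of gallery rows sorted by descending cosine similarity to query."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q
    order = np.argsort(-sims)
    return order, sims[order]

rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 64))
query = gallery[42] + 0.05 * rng.normal(size=64)  # near-duplicate of item 42
order, sims = rank_gallery(query, gallery)
print(order[0])  # → 42, the near-duplicate ranks first
```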
One-class anomaly detection with z-score indicators, stress-tested against adversarial forgery simulations.
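A bare-bones sketch of z-score flagging against a per-artist reference distribution, run here on synthetic vectors; the real detector presumably scores learned embeddings, and the 3.0 threshold is illustrative:

```python
import numpy as np

def fit_reference(embeddings):
    """Distances of an artist's reference works to their centroid set the normal range."""
    centroid = embeddings.mean(axis=0)
    d = np.linalg.norm(embeddings - centroid, axis=1)
    return centroid, d.mean(), d.std()

def z_score(query, centroid, mu, sigma):
    """How many standard deviations the query sits outside the reference range."""
    return (np.linalg.norm(query - centroid) - mu) / sigma

rng = np.random.default_rng(1)
reference = rng.normal(0, 1, size=(200, 32))   # stand-in for authentic embeddings
centroid, mu, sigma = fit_reference(reference)
authentic = rng.normal(0, 1, size=32)
forgery = rng.normal(3, 1, size=32)            # drawn from a shifted distribution
print(round(float(z_score(authentic, centroid, mu, sigma)), 1),
      round(float(z_score(forgery, centroid, mu, sigma)), 1))
```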
Dirichlet Process Gaussian Mixture Models infer how many distinct hands contributed to a painting.
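scikit-learn's `BayesianGaussianMixture` implements a truncated Dirichlet Process mixture and can illustrate the "how many hands" inference on synthetic 2-D patch embeddings. The truncation level, concentration prior, and weight threshold below are illustrative choices, not the project's:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
hand_a = rng.normal(0.0, 0.3, size=(150, 2))   # synthetic patch embeddings, painter A
hand_b = rng.normal(4.0, 0.3, size=(150, 2))   # painter B, well separated
patches = np.vstack([hand_a, hand_b])

dpgmm = BayesianGaussianMixture(
    n_components=10,                            # truncation level, not the answer
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,             # small prior discourages extra components
    max_iter=300,
    random_state=0,
).fit(patches)

# Components that actually received appreciable mass = inferred number of hands.
active = int((dpgmm.weights_ > 0.05).sum())
print(active)  # the two synthetic hands
```

The appeal of the DP prior is that the number of clusters is inferred from the data rather than fixed in advance; unused components simply receive negligible weight.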
Gaussian process regression over dated embeddings models an artist's stylistic evolution over time.
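A minimal sketch of the trajectory model with scikit-learn, regressing a single synthetic style coordinate on date. Real embeddings would be higher-dimensional, and the RBF-plus-noise kernel here is an assumption:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
years = np.sort(rng.uniform(1870, 1910, size=40))[:, None]
# Pretend one style coordinate drifts smoothly with date, plus attribution noise.
style = np.sin((years.ravel() - 1870) / 10.0) + 0.1 * rng.normal(size=40)

gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=10.0) + WhiteKernel(noise_level=0.01),
    normalize_y=True,
).fit(years, style)

# Predict the style coordinate for an undated work, with uncertainty.
mean, std = gp.predict(np.array([[1895.0]]), return_std=True)
print(round(float(mean[0]), 2), round(float(std[0]), 2))
```

The predictive standard deviation is what makes a GP attractive here: it widens in sparsely dated stretches of the artist's career.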
WikiArt (81,444 images) and 126-artist forgery validation
| Backbone | Style Acc. | Style F1 | Artist Top-1 | Artist Top-5 | Genre Acc. |
|---|---|---|---|---|---|
| DINOv2 ViT-B/14 | 57.5% | 0.553 | 64.7% | 90.9% | 71.0% |
| CLIP ViT-L/14 | 67.1% | 0.656 | 74.6% | 95.9% | 75.0% |
| Fusion (frozen) | 65.0% | 0.633 | 71.0% | 94.2% | 74.2% |
| Fusion (fine-tuned) | 71.6% | 0.703 | 77.8% | 96.2% | 75.1% |
| Fusion (end-to-end) | 72.7% | — | 79.0% | 96.9% | 76.6% |
All metrics are macro-averaged. Fine-tuning used a combined SupCon + cross-entropy objective with the last three transformer blocks unfrozen, trained for 5 epochs on a Tesla P100.
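For reference, a self-contained sketch of the supervised contrastive (SupCon) term used alongside cross-entropy during fine-tuning. The temperature, batch, and feature sizes are illustrative, and the project's exact loss weighting is not shown:

```python
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of L2-normalised features."""
    f = F.normalize(features, dim=1)
    sim = f @ f.T / temperature
    n = f.size(0)
    eye = torch.eye(n, dtype=torch.bool)
    sim = sim.masked_fill(eye, float("-inf"))   # exclude self-comparisons
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (labels[:, None] == labels[None, :]) & ~eye
    # Mean log-probability over each anchor's same-class positives.
    per_anchor = torch.where(pos, log_prob, torch.zeros_like(log_prob)).sum(1)
    return -(per_anchor / pos.sum(1).clamp(min=1)).mean()

feats = torch.randn(8, 128)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
# In training, this term would be added to the cross-entropy classification loss.
print(float(supcon_loss(feats, labels)))
```

SupCon pulls same-artist embeddings together and pushes different artists apart, which complements the cross-entropy heads rather than replacing them.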
One-class anomaly detection is trained per artist and reported as ROC-AUC on held-out sets; results vary with reference-corpus size.
Dual backbones, cross-attention fusion, and six analytical heads
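One plausible shape for the fusion module, with tokens from one backbone attending over the other's. A shared width of 768 is assumed for simplicity (CLIP ViT-L/14 actually produces 1024-dim tokens and would need a projection); head count and residual wiring are likewise assumptions, not the project's confirmed design:

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """One backbone's tokens attend over the other's; residual + norm on top."""
    def __init__(self, dim=768, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, clip_tokens, dino_tokens):
        # Queries from CLIP, keys/values from DINOv2.
        fused, _ = self.attn(clip_tokens, dino_tokens, dino_tokens)
        return self.norm(clip_tokens + fused)

fusion = CrossAttentionFusion()
out = fusion(torch.randn(2, 197, 768), torch.randn(2, 257, 768))
print(out.shape)  # fused sequence keeps the query backbone's token count
```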
Building on decades of computational art analysis and self-supervised vision
Full bibliography with DOIs available in the README and methodology documentation.
From pip install to first analysis in under a minute