Changes in version 1.3.0

New Features

Adaptive pairing & ranking framework

- Introduced a full adaptive pairing / adaptive ranking framework designed to efficiently rank large sets of writing samples using uncertainty-aware pair selection and Bayesian inference.
- Added adaptive_rank(), the primary user-facing wrapper that runs the complete adaptive workflow end-to-end, including warm start, adaptive pairing rounds, Bayesian BTL refits, diagnostics, and stopping.
- Advanced control is available via:
  - adaptive_rank_start(): initialize an adaptive run and state
  - adaptive_rank_run_live(): execute live adaptive comparisons
  - adaptive_rank_resume(): resume interrupted or long-running runs

  These functions are intended for custom orchestration and fault-tolerant execution.
- Adaptive pairing is organized into rounds that balance global scale identification and local refinement using a mixture of anchor, long-range, mid-range, and local comparisons.
- The adaptive controller tracks a global identifiability state based on Bayesian diagnostics and agreement between online (TrueSkill) and global (BTL) rankings. Once the global scale is identified:
  - long-range comparisons are automatically tapered,
  - comparison budget is reallocated toward local and boundary-refining pairs,
  - exploration rates are reduced to focus on decision-relevant uncertainty.
- Long-range comparisons are additionally posterior-gated in later stages, preventing wasted comparisons on pairs that are already decisively ordered.
- Late-stage local pairing prioritizes near-tie pairs, with limited, auditable overrides to degree caps when especially informative comparisons are blocked.
- Adaptive runs produce fully auditable step-, round-, and refit-level logs, recording candidate generation, fallbacks, gating decisions, quota reallocations, and stopping criteria.
- All adaptive workflows use standardized configuration, state, and logging contracts to ensure reproducibility and future extensibility.
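
The posterior gating described above can be pictured with a small base-R sketch. It is not the package's implementation: the helper name gate_pairs(), the layout of the draws matrix (one row per posterior draw, one column per item), and the 0.95 threshold are all assumptions made for illustration. The idea is to estimate, for each candidate pair, the posterior probability that one item outranks the other, and to drop pairs whose ordering is already decisive.

```r
# Hypothetical helper for illustration only; not part of the package API.
# draws:     numeric matrix, rows = posterior draws, columns = item skill parameters.
# pairs:     two-column matrix of candidate item indices (i, j).
# threshold: assumed cutoff above which a pair counts as decisively ordered.
gate_pairs <- function(draws, pairs, threshold = 0.95) {
  # Posterior probability that item i has higher skill than item j.
  p_i_beats_j <- vapply(
    seq_len(nrow(pairs)),
    function(k) mean(draws[, pairs[k, 1]] > draws[, pairs[k, 2]]),
    numeric(1)
  )
  # A pair is decisive when the ordering is near-certain in either direction.
  decisive <- pmax(p_i_beats_j, 1 - p_i_beats_j) >= threshold
  data.frame(
    i = pairs[, 1],
    j = pairs[, 2],
    p_i_beats_j = p_i_beats_j,
    keep = !decisive
  )
}

# Toy posterior: item 1 is clearly strongest, items 2 and 3 are nearly tied.
set.seed(1)
draws <- cbind(
  rnorm(2000, mean = 3.0, sd = 0.3),
  rnorm(2000, mean = 0.1, sd = 0.3),
  rnorm(2000, mean = 0.0, sd = 0.3)
)
pairs <- rbind(c(1, 2), c(1, 3), c(2, 3))
gated <- gate_pairs(draws, pairs)
print(gated)
```

In this toy case the two pairs involving item 1 are gated out and only the near-tie pair (2, 3) is kept, which mirrors the late-stage focus on near-tie comparisons.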

Bayesian Bradley–Terry–Luce (BTL) modeling

- Added a fully Bayesian Bradley–Terry–Luce (BTL) model implemented via CmdStan, providing posterior uncertainty estimates for item skill parameters.
- New entrypoint fit_bayes_btl_mcmc() enables direct posterior inference from pairwise comparison data, independent of or integrated with adaptive workflows.
- Supports multiple model variants (including error and positional bias extensions) and optional refitting on increasing subsets of comparisons.
- Bayesian BTL outputs integrate seamlessly with adaptive ranking utilities (summarize_items(), summarize_refits()), serving as the statistical backbone for adaptive pairing decisions.

Model support & live API improvements

- Added support for the Gemini Flash model gemini-3-flash-preview for live pairwise comparisons.
- Added support for OpenAI service tiers / priority routing via service_tier for applicable live models.
  - Enables tiers such as "flex" and "priority" when supported by the selected model.
  - Integrated into the live submission path without requiring changes to calling code.

Documentation

- Expanded and clarified documentation for adaptive ranking, Bayesian BTL, and live model configuration.
- Updated examples to reflect new adaptive and Bayesian APIs.

Changes in version 1.2.0

New Features

- Parallel Processing:
  - submit_llm_pairs() and backend-specific live functions (OpenAI, Anthropic, Gemini, Together, Ollama) now support parallel execution via parallel = TRUE and workers = n (requires the {future} package).
- Incremental Saving & Resume:
  - Added save_path argument to live submission functions. Results are saved to CSV incrementally, allowing interrupted jobs to resume automatically by skipping previously processed pairs.
- Robust Error Handling:
  - Failed API calls no longer stop the entire process. Failures are captured and returned separately, allowing for easier inspection and re-submission.
- Added estimate_llm_pairs_cost() to estimate costs in live and batch mode.
- Introduced llm_submit_pairs_multi_batch() and llm_resume_multi_batches() to split large comparison sets across multiple batches and resume polling later. These helpers support writing per-batch and combined results, along with an optional jobs registry.

Bug fixes

- The prompt format for Anthropic batch comparisons now matches the Anthropic live format.
- Reverse consistency functions can now handle duplicate pairs.

Breaking Changes

- submit_llm_pairs() and its backend-specific counterparts now return a list containing two elements: $results (a tibble of successful comparisons) and $failed_pairs (a tibble of inputs that failed). Previous versions returned a single tibble, so code that used the return value directly as a tibble must now read $results.

Changes in version 1.1.0 (2025-12-22)

Models

- Added GPT-5.2.
- Ensured models can be called with a dated identifier, e.g. gpt-5.2-2025-12-11.
- Default temperature is now 0 for non-reasoning models and the provider default (typically 1) for reasoning models.

Tests

- Added tests to improve coverage.

Documentation

- Changed pkgdown site layout.
- Added codemeta.json.
- Added repo logo.
- Updated function examples.
- Added references to DESCRIPTION.

Miscellaneous

- No longer sets global variables; this is now handled in individual functions.
- Added verbose option in fit_bt_model() and summarize_bt_fit().
- Moved null coalescing helper to a separate R file.
- Changed validation of API keys in multiple functions.

Changes in version 1.0.0

- Initial release.
- Unified live and batch LLM comparison framework (OpenAI / Anthropic / Gemini).
- Live support for Together.ai and local Ollama backends.
- Tools for Bradley–Terry and Elo models, and positional bias checks.