Changes in version 1.3.0                        

New Features

Adaptive pairing & ranking framework

  - Introduced a full adaptive pairing / adaptive ranking framework
    designed to efficiently rank large sets of writing samples using
    uncertainty-aware pair selection and Bayesian inference.

  - Added adaptive_rank(), the primary user-facing wrapper that runs the
    complete adaptive workflow end-to-end, including warm start,
    adaptive pairing rounds, Bayesian BTL refits, diagnostics, and
    stopping.

  - Advanced control is available via:
    
      - adaptive_rank_start() — initialize an adaptive run and state
      - adaptive_rank_run_live() — execute live adaptive comparisons
      - adaptive_rank_resume() — resume interrupted or long-running runs
        These functions are intended for custom orchestration and
        fault-tolerant execution.

  - Adaptive pairing is organized into rounds that balance global scale
    identification and local refinement using a mixture of anchor,
    long-range, mid-range, and local comparisons.

  - The adaptive controller tracks a global identifiability state based
    on Bayesian diagnostics and agreement between online (TrueSkill) and
    global (BTL) rankings. Once the global scale is identified:
    
      - long-range comparisons are automatically tapered,
      - comparison budget is reallocated toward local and
        boundary-refining pairs,
      - exploration rates are reduced to focus on decision-relevant
        uncertainty.

  - Long-range comparisons are additionally posterior-gated in later
    stages, preventing wasted comparisons on pairs that are already
    decisively ordered.

  - Late-stage local pairing prioritizes near-tie pairs, with limited,
    auditable overrides to degree caps when especially informative
    comparisons are blocked.

  - Adaptive runs produce fully auditable step-, round-, and refit-level
    logs, recording candidate generation, fallbacks, gating decisions,
    quota reallocations, and stopping criteria.

  - All adaptive workflows use standardized configuration, state, and
    logging contracts to ensure reproducibility and future
    extensibility.

Bayesian Bradley–Terry–Luce (BTL) modeling

  - Added a fully Bayesian Bradley–Terry–Luce (BTL) model implemented
    via CmdStan, providing posterior uncertainty estimates for item
    skill parameters.
  - New entrypoint fit_bayes_btl_mcmc() enables direct posterior
    inference from pairwise comparison data, independent of or
    integrated with adaptive workflows.
  - Supports multiple model variants (including error and positional
    bias extensions) and optional refitting on increasing subsets of
    comparisons.
  - Bayesian BTL outputs integrate seamlessly with adaptive ranking
    utilities (summarize_items(), summarize_refits()), serving as the
    statistical backbone for adaptive pairing decisions.

Model support & live API improvements

  - Added support for the Gemini Flash model gemini-3-flash-preview for
    live pairwise comparisons.
  - Added support for OpenAI service tiers / priority routing via
    service_tier for applicable live models.
      - Enables tiers such as "flex" and "priority" when supported by
        the selected model.
      - Integrated into the live submission path without requiring
        changes to calling code.

Documentation

  - Expanded and clarified documentation for adaptive ranking, Bayesian
    BTL, and live model configuration.
  - Updated examples to reflect new adaptive and Bayesian APIs.

                        Changes in version 1.2.0                        

New Features

  - Parallel Processing:
      - submit_llm_pairs() and backend-specific live functions (OpenAI,
        Anthropic, Gemini, Together, Ollama) now support parallel
        execution via parallel = TRUE and workers = n (requires the
        {future} package).
  - Incremental Saving & Resume:
      - Added save_path argument to live submission functions. Results
        are saved to CSV incrementally, allowing interrupted jobs to
        resume automatically by skipping previously processed pairs.
  - Robust Error Handling:
      - Failed API calls no longer stop the entire process. Failures are
        captured and returned separately, allowing for easier inspection
        and re-submission.
  - Added estimate_llm_pairs_cost() to estimate costs in live and batch
    mode.
  - Introduced llm_submit_pairs_multi_batch() and
    llm_resume_multi_batches() to split large comparison sets across
    multiple batches and resume polling later. These helpers support
    writing per‑batch and combined results, along with an optional jobs
    registry.

Bug fixes

  - The prompt format for anthropic batch comparisons now match the
    anthropic live format.
  - Reverse consistency functions can now handle duplicate pairs.

Breaking Changes

  - submit_llm_pairs() and its backend-specific counterparts now return
    a list containing two elements: $results (a tibble of successful
    comparisons) and $failed_pairs (a tibble of inputs that failed).
    Previous versions returned a single tibble.

                 Changes in version 1.1.0 (2025-12-22)                  

Models

  - Added GPT-5.2
  - Ensured models can be called with date format, e.g.
    gpt-5.2-2025-12-11
  - Default temperature setting is set to 0 for non-reasoning models,
    provider default for reasoning models (typically 1)

Tests

  - Tests added to improve coverage

Documentation

  - Changed pkgdown site layout
  - Added codemeta.json
  - Added repo logo
  - Updated function examples
  - Add references to Description

Miscellaneous

  - No longer set global variables, now done in individual functions
  - Added verbose option in fit_bt_model() and summarize_bt_fit()
  - Moved null coalescing helper to separate R file
  - Changed validation of API keys in multiple functions

                        Changes in version 1.0.0                        

  - Initial release.
  - Unified live and batch LLM comparison framework (OpenAI / Anthropic
    / Gemini).
  - Live support for Together.ai and local Ollama backends.
  - Tools for Bradley–Terry and Elo models, positional bias checks