WebUI95: UI-to-Code Generation and Analysis

This repository contains the code and data for the Web++ project, which evaluates UI-to-code generation models.

Repository Structure

web++/
├── WebUI95/                    # UI-to-Code Experiment Pipeline
│   ├── output/                 # Final generated HTML (63,824 samples)
│   │   └── final_html/         # Generated static HTML files
│   ├── final_results_64k.zip   # Compressed final results
│   ├── slurm/                  # SLURM job scripts
│   └── *.py                    # Processing scripts
│
├── data_analysis/              # Quality and Diversity Analysis
│   ├── analysis/               # Analysis results
│   │   ├── *.json              # Embedding and metric files
│   │   └── plots/              # Generated figures
│   ├── webgen_bench/           # WebGen-Bench baseline
│   │   └── fixed_samples.zip   # Fixed prompt experiments (475 samples)
│   ├── scripts/                # Analysis scripts
│   └── slurm/                  # SLURM job scripts
│
└── venv/                       # Python virtual environment

Datasets

WebUI95 Dataset

  • Location: WebUI95/output/final_html/ or WebUI95/final_results_64k.zip
  • Size: 63,824 UI samples
  • Format: Each sample contains generated static HTML/CSS

WebGen-Bench Fixed Prompt Samples

  • Location: data_analysis/webgen_bench/fixed_samples.zip
  • Size: 475 samples
  • Purpose: Control experiment for diversity analysis (same prompt, varying seeds)
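Both datasets ship as zip archives, so samples can be read without extracting everything to disk. The following is a minimal sketch, assuming the archives contain plain `.html` members; the helper name `iter_html_samples` and the member paths in the demo are hypothetical, and the real internal layout of the archives may differ.

```python
import io
import zipfile
from typing import BinaryIO, Iterator, Tuple, Union


def iter_html_samples(archive: Union[str, BinaryIO]) -> Iterator[Tuple[str, str]]:
    """Yield (member_name, html_text) for every .html file in a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        for name in sorted(zf.namelist()):
            if name.lower().endswith(".html"):
                # errors="replace" keeps iteration going on unexpected encodings
                yield name, zf.read(name).decode("utf-8", errors="replace")


# Demo on a tiny in-memory archive so the sketch runs without the real data.
# The member names below are made up for illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo_zip:
    demo_zip.writestr("final_html/a.html", "<html><body>A</body></html>")
    demo_zip.writestr("final_html/b.html", "<html><body>B</body></html>")
    demo_zip.writestr("final_html/notes.txt", "not a sample")
buffer.seek(0)

samples = list(iter_html_samples(buffer))
print(len(samples))  # number of .html members found
```

To read the real data, pass a path such as `WebUI95/final_results_64k.zip` or `data_analysis/webgen_bench/fixed_samples.zip` instead of the in-memory buffer.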

Key Scripts

WebUI95 Pipeline

| Script | Description |
|---|---|
| run_inference_multigpu.py | Multi-GPU inference for UI-to-code generation |
| process_unified_data.py | Data preprocessing |
| merge_outputs.py | Merge outputs from multiple workers |

Data Analysis

| Script | Description |
|---|---|
| scripts/uiclip_analysis.py | UIClip quality and similarity scoring |
| scripts/compute_code_embeddings.py | Qwen code embeddings for diversity |
| scripts/generate_figures.py | Generate analysis figures |
| scripts/html_statistics.py | DOM complexity analysis |
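To give a sense of what DOM complexity analysis can involve, here is a small standard-library sketch that counts elements and measures maximum nesting depth. It is not the logic of `scripts/html_statistics.py`, which may use different metrics and a different parser; `DomStats` and `dom_stats` are hypothetical names.

```python
from html.parser import HTMLParser

# Void elements never have children or closing tags.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class DomStats(HTMLParser):
    """Track element count and maximum nesting depth while parsing."""

    def __init__(self):
        super().__init__()
        self.element_count = 0
        self.max_depth = 0
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        self.element_count += 1
        if tag in VOID_TAGS:
            # A void element sits one level below the current open element.
            self.max_depth = max(self.max_depth, self._depth + 1)
            return
        self._depth += 1
        self.max_depth = max(self.max_depth, self._depth)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._depth > 0:
            self._depth -= 1


def dom_stats(html: str) -> dict:
    """Return element count and maximum depth for one HTML document."""
    parser = DomStats()
    parser.feed(html)
    parser.close()
    return {"elements": parser.element_count, "max_depth": parser.max_depth}


sample = "<html><body><div><p>Hi<br></p><img src='x.png'></div></body></html>"
print(dom_stats(sample))  # {'elements': 6, 'max_depth': 5}
```

The depth counter assumes reasonably well-formed markup; unclosed non-void tags would inflate the reported depth.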

Results

Visual Quality (UIClip)

| Metric | Original | Generated |
|---|---|---|
| Mean Quality Score | 0.487 | 0.518 |
| Mean Similarity | - | 0.938 |

Diversity Analysis

| Type | Original | Generated | Synthetic (Fixed) |
|---|---|---|---|
| Visual | 0.150 | 0.137 | 0.140 |
| Code | 0.119 | 0.049 | 0.009 |

Key Finding: The synthetic fixed-prompt experiment shows near-zero code diversity (0.009), consistent with mode collapse when the same prompt is reused across seeds. The generated samples retain roughly five times that code diversity (0.049), although this is still below the original data (0.119).
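This README does not state how the diversity scores are computed. One common choice, assumed here purely for illustration, is the mean pairwise cosine distance between sample embeddings: values near 0 mean the samples are nearly identical, larger values mean more spread. The function names below are hypothetical and the repository's scripts may define diversity differently.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def mean_pairwise_distance(embeddings: Sequence[Sequence[float]]) -> float:
    """Average cosine distance over all unordered pairs of embeddings."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


# Toy vectors: identical embeddings give a score near zero,
# which is the pattern a collapsed fixed-prompt run would show.
collapsed = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
spread = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(mean_pairwise_distance(collapsed))
print(mean_pairwise_distance(spread))
```

The all-pairs loop is quadratic in the number of samples, so at the scale of 63,824 samples a vectorized or subsampled variant would be more practical.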

Figures

Generated figures are in data_analysis/analysis/plots/:

  • figure2_html_statistics.png - HTML structure comparison
  • figure3_combined_diversity_comparative.png - Diversity analysis

Requirements

  • Python 3.10+
  • PyTorch 2.0+
  • Transformers
  • Playwright (for screenshot generation)
  • UIClip model: biglab/uiclip_jitteredwebsites-2-224-paraphrased_webpairs_humanpairs
  • Qwen models: Qwen/Qwen3-Coder-30B-A3B-Instruct, Qwen/Qwen2.5-Coder-1.5B
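A quick way to confirm that an environment meets the list above is to check the Python version and whether the packages are importable. This is a minimal sketch, not a script from the repository; the helper names are hypothetical, and it does not verify package versions or that the models are downloaded.

```python
import sys
from importlib.util import find_spec
from typing import Dict, Iterable, Tuple

# Import names for PyTorch, Transformers, and Playwright.
REQUIRED_MODULES = ("torch", "transformers", "playwright")


def python_is_supported(minimum: Tuple[int, int] = (3, 10)) -> bool:
    """True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be found."""
    # find_spec locates a top-level module without importing it.
    return {name: find_spec(name) is not None for name in names}


print("Python 3.10+:", python_is_supported())
for module, available in check_modules(REQUIRED_MODULES).items():
    print(f"{module}: {'found' if available else 'missing'}")
```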

Compute Environment

  • GPU: NVIDIA A40 (48GB VRAM)
  • Partitions: gpu_a40, cpu_sapphire

Citation

If you use this code or data, please cite:

@inproceedings{webuininetyfive2026,
  title={WebUI-95: A Large-Scale Dataset of Normalized Web Interfaces via UI-to-Code Generation},
  author={...},
  booktitle={CHI Posters},
  year={2026}
}

Acknowledgments

Computational resources were provided by an institutional HPC cluster.