Discovery Portal

The Posters.science discovery portal enables users to search, explore, and access publicly shared scientific posters through advanced search capabilities and AI-powered discovery tools.

Overview

The discovery portal allows searching through publicly shared posters. Meilisearch powers search functionality with full-text queries and filtered searches (author, year, conference). When viewing a poster, users see structured metadata, abstract, and a link to the repository file. Redis caches frequent queries for performance. The platform supports user-driven feedback for metadata corrections and contributions.

Search Capabilities

Traditional Search

Full-Text Search: Search across poster content and metadata
Faceted Filtering: Filter by author, year, conference, repository
Sorting Options: Relevance, date, title, author
Search Suggestions: Auto-complete and query suggestions
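
A filtered, sorted query of this kind could be expressed as a Meilisearch search request body. The sketch below builds such a body using Meilisearch's `q`, `filter`, `sort`, and `limit` search parameters. The attribute names (`author`, `year`, `conference`, `date`, `title`), the sort directions, and the helper names are assumptions for illustration and are not taken from the portal's actual index settings.

```python
from typing import Any, Dict, List, Optional

# Hypothetical mapping from a UI sort option to a Meilisearch sort rule.
SORT_RULES = {"date": "date:desc", "title": "title:asc", "author": "author:asc"}


def quote(value: str) -> str:
    """Wrap a filter value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_payload(
    query: str,
    author: Optional[str] = None,
    year: Optional[int] = None,
    conference: Optional[str] = None,
    sort_by: str = "relevance",
    limit: int = 20,
) -> Dict[str, Any]:
    """Build a search request body; list entries in `filter` are ANDed together."""
    filters: List[str] = []
    if author is not None:
        filters.append(f"author = {quote(author)}")
    if year is not None:
        filters.append(f"year = {int(year)}")
    if conference is not None:
        filters.append(f"conference = {quote(conference)}")

    payload: Dict[str, Any] = {"q": query, "limit": limit}
    if filters:
        payload["filter"] = filters
    # Relevance is the engine's default ranking, so no sort rule is sent for it.
    if sort_by != "relevance":
        payload["sort"] = [SORT_RULES[sort_by]]
    return payload
```

The resulting dictionary would be sent as the JSON body of a search request against the poster index. The filtered and sorted attributes would also need to be declared as filterable and sortable in the index settings.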

AI-Enhanced Discovery

Natural Language Queries: Conversational search interface
Contextual Understanding: Semantic search capabilities
Related Content: AI-powered recommendations

Smart Search Implementation

Smart Search enables natural language questions with AI-generated summaries and links to relevant posters. The Retrieval Augmented Generation (RAG) pipeline:

Embeds user queries
Performs vector similarity search against pre-computed poster embeddings in the database
Retrieves the top 5 posters
Passes them as context to an LLM for response generation
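
The four steps above can be sketched end to end as follows. This is a minimal illustration, not the portal's implementation: `embed` and `generate` are hypothetical callables standing in for the embedding model and the LLM, and an in-memory list replaces the database search.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Poster = Dict[str, object]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def smart_search(
    question: str,
    posters: List[Poster],
    embed: Callable[[str], Vector],
    generate: Callable[[str, List[Poster]], str],
    top_k: int = 5,
) -> Tuple[str, List[Poster]]:
    # Step 1: embed the user query.
    query_vec = embed(question)
    # Step 2: rank posters by similarity to the query embedding.
    ranked = sorted(
        posters,
        key=lambda p: cosine_similarity(query_vec, p["embedding"]),
        reverse=True,
    )
    # Step 3: keep the top-k posters (five in the pipeline described above).
    context = ranked[:top_k]
    # Step 4: pass them as context to the LLM.
    answer = generate(question, context)
    return answer, context
```

Returning the retrieved posters alongside the answer lets the caller render source links next to the generated summary.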

Embedding Model

The bge-large-en-v1.5 embedding model (Apache 2.0 license) generates 1024-dimensional embeddings from up to 512 tokens. Each poster is embedded by combining title, authors, conference, abstract, keywords, and content.

Because the combined text can exceed the 512-token limit, each field is embedded separately and the resulting vectors are combined into a single weighted average:

Title: 25%
Abstract: 35%
Keywords: 20%
Content: 15%
Metadata: 5%
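
The weighting can be illustrated with a short sketch. It uses the percentages listed above. Rescaling the weights when a field is missing and L2-normalizing the result are assumptions added for illustration, and the function name is hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Field weights from the list above (they sum to 1.0).
FIELD_WEIGHTS = {
    "title": 0.25,
    "abstract": 0.35,
    "keywords": 0.20,
    "content": 0.15,
    "metadata": 0.05,
}


def combine_embeddings(field_vectors: Dict[str, Sequence[float]]) -> List[float]:
    """Weighted average of per-field embeddings, returned as a unit vector."""
    present = {n: v for n, v in field_vectors.items() if n in FIELD_WEIGHTS}
    if not present:
        raise ValueError("no known fields to combine")

    # Assumption: if a field is missing, rescale the remaining weights to sum to 1.
    total_weight = sum(FIELD_WEIGHTS[n] for n in present)
    dim = len(next(iter(present.values())))
    combined = [0.0] * dim
    for name, vec in present.items():
        if len(vec) != dim:
            raise ValueError(f"dimension mismatch in field {name!r}")
        weight = FIELD_WEIGHTS[name] / total_weight
        for i, value in enumerate(vec):
            combined[i] += weight * value

    # Assumption: normalize to unit length, which suits cosine similarity.
    norm = math.sqrt(sum(v * v for v in combined))
    return [v / norm for v in combined] if norm else combined
```

In the real pipeline each input vector would have 1024 dimensions. The sketch works for any dimension as long as all fields agree.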

Vector Storage

pgvector (a PostgreSQL extension) stores embeddings directly in the database. The poster_embeddings table uses an IVFFlat index optimized for cosine similarity, configured with 100 lists for ~50,000 posters, with query times of 10-50 ms.
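
A setup along these lines could look as follows in pgvector's SQL syntax, shown here as Python string constants. Only the table name `poster_embeddings`, the 1024 dimensions, the cosine operator class, and `lists = 100` come from the description above. The column names, the index name, and the named placeholder style (as used by psycopg) are assumptions.

```python
from typing import Sequence

# Schema and index. Column and index names are hypothetical.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS poster_embeddings (
    poster_id bigint PRIMARY KEY,
    embedding vector(1024) NOT NULL
);
"""

CREATE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS poster_embeddings_ivfflat_idx
    ON poster_embeddings
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
"""

# `<=>` is pgvector's cosine distance operator, so similarity is 1 - distance.
TOP_K_SQL = """
SELECT poster_id, 1 - (embedding <=> %(query_vec)s::vector) AS similarity
FROM poster_embeddings
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT %(top_k)s;
"""


def to_vector_literal(values: Sequence[float]) -> str:
    """Format a Python sequence as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"
```

With a database driver, the query would be executed with parameters such as `{"query_vec": to_vector_literal(vec), "top_k": 5}`. IVFFlat trades recall for speed, and pgvector's `ivfflat.probes` setting controls how many lists are scanned per query.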

Query Processing

Query processing includes:

Named entity recognition for conferences/years
Medical term expansion with synonyms
Embedding for vector similarity search
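
The first two steps can be illustrated with a deliberately simple sketch. It uses regular expressions and lookup tables as a stand-in for a real named entity recognition model. The conference list and the synonym table are hypothetical examples, not the portal's actual vocabularies.

```python
import re
from typing import Dict, List

# Hypothetical vocabularies, for illustration only.
KNOWN_CONFERENCES = {"neurips", "ismb", "amia"}
SYNONYMS = {
    "heart attack": ["myocardial infarction"],
    "high blood pressure": ["hypertension"],
}

YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")
TOKEN_PATTERN = re.compile(r"[a-z0-9]+")


def preprocess_query(query: str) -> Dict[str, object]:
    """Extract years and conferences, then append synonyms to the query text."""
    lowered = query.lower()

    # Entity recognition stand-in: four-digit years and known conference names.
    years: List[int] = [int(m.group(0)) for m in YEAR_PATTERN.finditer(lowered)]
    tokens = TOKEN_PATTERN.findall(lowered)
    conferences = [t for t in tokens if t in KNOWN_CONFERENCES]

    # Term expansion: add synonyms for every known phrase found in the query.
    expansions = [
        synonym
        for term, synonyms in SYNONYMS.items()
        if term in lowered
        for synonym in synonyms
    ]
    expanded_text = " ".join([query] + expansions)

    return {
        "years": years,
        "conferences": conferences,
        "expanded_text": expanded_text,
    }
```

The extracted years and conferences could narrow the candidate set with filters, while `expanded_text` is the string that would be passed to the embedding model for the third step.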

Response Generation

Response generation uses Llama 3.3 70B (4-bit quantization, vLLM deployment) with an 8,192-token context window. The prompt provides context from five retrieved posters and instructs the LLM to synthesize information, cite sources, and respond in under 300 words.
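
A prompt of this shape could be assembled as in the sketch below. The wording of the instructions, the numbered citation format, and the `title` and `abstract` keys are assumptions. Only the five-poster context and the 300-word limit come from the description above.

```python
from typing import Dict, List


def build_prompt(
    question: str,
    posters: List[Dict[str, str]],
    max_posters: int = 5,
    word_limit: int = 300,
) -> str:
    """Assemble an LLM prompt from the question and the retrieved posters."""
    # Number each poster so the model can cite it as [1], [2], ...
    blocks = [
        f"[{i}] {poster['title']}\n{poster['abstract']}"
        for i, poster in enumerate(posters[:max_posters], start=1)
    ]
    context = "\n\n".join(blocks)
    return (
        "Answer the question using only the posters below.\n"
        "Synthesize the information, cite posters by number such as [1], "
        f"and keep the answer under {word_limit} words.\n\n"
        f"Posters:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

A production version would also need to check that the assembled prompt and the expected answer length fit within the 8,192-token context window, for example by truncating long abstracts.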

Performance Targets

Evaluation targets:

Precision@5 ≥80%
Recall@5 ≥60%
SciBERT similarity ≥0.70
Human evaluation ≥4.0/5.0
Hallucination rate <5%
Total latency <3 seconds
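
For reference, the two retrieval metrics in this list can be computed as in the sketch below. It follows the conventional definitions of Precision@k and Recall@k. The function names and the use of poster IDs with a set of relevance judgments are assumptions about how an evaluation set might be represented.

```python
from typing import Hashable, Sequence, Set


def precision_at_k(retrieved: Sequence[Hashable], relevant: Set[Hashable], k: int = 5) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[Hashable], relevant: Set[Hashable], k: int = 5) -> float:
    """Fraction of all relevant items that appear in the top-k retrieved items."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    return hits / len(relevant)
```

With only five results returned, Recall@5 cannot reach 100% for a query that has more than five relevant posters, which helps explain why the recall target is lower than the precision target.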

Redis caches identical queries for 24 hours. Poster embeddings are pre-computed nightly.
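
The 24-hour query cache can be illustrated with an in-memory stand-in for Redis. The key prefix, the hashing scheme, and the class name are assumptions. Folding case and whitespace before hashing is also an assumption that goes slightly beyond strictly identical queries.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as stated above


def cache_key(query: str) -> str:
    """Derive a fixed-length key from the query text."""
    # Assumption: fold case and collapse whitespace before hashing.
    normalized = " ".join(query.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return "smart-search:" + digest


class TTLCache:
    """Minimal in-memory cache with per-entry expiry, standing in for Redis."""

    def __init__(
        self,
        ttl_seconds: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (self._clock() + self._ttl, value)
```

With Redis itself, the same behavior is available by storing each value with an expiry time, so no manual expiry check is needed.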

Overview Page Analytics

The Overview page provides visualizations of database trends and patterns:

Poster Growth Over Time: time-series chart of monthly registrations
Top Institutions: bar chart of the 20 institutions with the most posters
Research Domain Distribution: treemap showing poster distribution across fields
Conference Landscape: network graph mapping conference ecosystem
Funding Landscape: Sankey diagram showing funder-to-domain flows
Geographic Distribution: world map with choropleth coloring
Collaboration Network: force-directed graph of inter-institutional collaborations

Technical Implementation

Visualizations use Apache ECharts or D3.js in Vue 3 components. Data is served via API endpoints querying PostgreSQL materialized views (refreshed nightly, cached in Redis for 24 hours).

The dashboard includes:

Interactive filtering
Drill-down capabilities
Export options (PNG, SVG, CSV)

Accessibility Features

High contrast mode
Keyboard navigation
Screen reader support

Technical Implementation

Search Engine

Meilisearch: Fast, typo-tolerant search

Performance Optimization

Redis Caching: Frequent query caching
Client-side Caching: Browser storage optimization
