# Robot News - Extended LLM Context File
# https://robot-new.com/llms-full.txt
# Version: 1.1.0
# Last Updated: 2026-02-14
#
# This file provides comprehensive information about the Robot News platform
# for AI/LLM systems to better understand and interact with our content.
#
# For a shorter summary, see: https://robot-new.com/llms.txt

================================================================================
SITE OVERVIEW
================================================================================

> Robot News is an automated RSS feed aggregation platform that collects,
> categorizes, and displays robotics news from industry-leading sources.
> The platform is designed to be the go-to destination for the latest
> robotics, AI, and automation news.

## Mission Statement

To provide a centralized, curated source of robotics industry news by
automatically aggregating content from trusted sources, applying intelligent
categorization, and presenting it in a clean, accessible format.

## Key Value Propositions

1. **Aggregation**: Collect news from 5+ trusted robotics industry sources
2. **Categorization**: Automatically classify articles into 6 main categories
3. **Deduplication**: Smart algorithms prevent duplicate content
4. **Freshness**: Content updated every 60 minutes via RSS polling
5. **Search**: Full-text search across all articles
6. **Attribution**: All content links to and credits original sources

================================================================================
SITE INFORMATION
================================================================================

| Property          | Value                                    |
|-------------------|------------------------------------------|
| Site Name         | Robot News                               |
| Domain            | https://robot-new.com                    |
| API Endpoint      | https://api.robot-new.com                |
| Platform Type     | News Aggregation                         |
| Primary Focus     | Robotics, AI, Automation, Drones         |
| Language          | English                                  |
| Update Frequency  | Every 60 minutes (RSS polling)           |
| Content Type      | News articles, industry updates          |
| Target Audience   | Robotics professionals, researchers,     |
|                   | enthusiasts, investors, journalists      |

================================================================================
CONTENT CATEGORIES
================================================================================

The site organizes robotics news into six main categories, each with a
distinct focus area:

## 1. Industry News (slug: industry-news)
- Color: Blue (#3B82F6)
- Icon: 📰
- Focus: Business updates, market trends, company announcements, mergers and
  acquisitions, funding rounds, partnerships, product launches, quarterly
  reports, industry analysis
- Example topics: Boston Dynamics acquisitions, robotics startup funding,
  manufacturing sector automation trends

## 2. Research (slug: research)
- Color: Purple (#8B5CF6)
- Icon: 🔬
- Focus: Academic papers, scientific breakthroughs, university research,
  peer-reviewed publications, experimental systems, novel algorithms,
  theoretical advances
- Example topics: New locomotion algorithms, soft robotics materials,
  machine learning for robot control, bio-inspired robotics

## 3. Humanoids (slug: humanoids)
- Color: Pink (#EC4899)
- Icon: 🤖
- Focus: Humanoid robots, bipedal systems, social robots, human-robot
  interaction, service robots, companion robots, android development
- Example topics: Tesla Optimus, Boston Dynamics Atlas, Agility Robotics
  Digit, Figure AI, social robots in healthcare

## 4. Industrial Robots (slug: industrial-robots)
- Color: Orange (#F97316)
- Icon: 🏭
- Focus: Manufacturing automation, assembly robots, welding robots,
  warehouse automation, logistics robots, collaborative robots (cobots),
  robotic arms, pick-and-place systems
- Example topics: FANUC robots, ABB automation, Amazon warehouse robots,
  automotive manufacturing, food processing automation

## 5. Drones (slug: drones)
- Color: Cyan (#06B6D4)
- Icon: 🚁
- Focus: UAVs (Unmanned Aerial Vehicles), aerial robotics, autonomous flight
  systems, delivery drones, inspection drones, agricultural drones, drone
  regulations, eVTOL
- Example topics: Drone delivery services, agricultural spraying drones,
  infrastructure inspection, urban air mobility, FAA regulations

## 6. AI/Automation (slug: ai-automation)
- Color: Green (#10B981)
- Icon: 🧠
- Focus: Machine learning for robotics, computer vision, autonomous systems,
  reinforcement learning, robot perception, AI safety, neural networks for
  control, sensor fusion
- Example topics: Vision-language models for robots, GPT for robotics,
  autonomous vehicle AI, robot manipulation learning

================================================================================
RSS FEED SOURCES
================================================================================

Articles are aggregated from these trusted robotics news sources:

## Primary Sources (Tier 1)

### The Robot Report
- Website: https://therobotreport.com
- RSS Feed: https://therobotreport.com/feed/
- Description: Leading source for robotics industry news, market analysis,
  and business updates. Covers startups, funding, and industry trends.
- Categories: Industry News, Research, Humanoids, Industrial Robots

### Robohub
- Website: https://robohub.org
- RSS Feed: https://robohub.org/feed/
- Description: Non-profit organization providing robotics research news,
  educational content, and community resources.
- Categories: Research, Humanoids, AI/Automation

### IEEE Spectrum Robotics
- Website: https://spectrum.ieee.org/topic/robotics/
- RSS Feed: https://spectrum.ieee.org/feeds/topic/robotics.rss
- Description: IEEE's flagship publication covering robotics technology,
  engineering innovations, and scientific advances.
- Categories: Research, Industrial Robots, AI/Automation

### MIT News - Robotics
- Website: https://news.mit.edu/topic/robotics
- RSS Feed: https://news.mit.edu/rss/topic/robotics
- Description: Research news from MIT's robotics labs, including CSAIL and
  the Media Lab. Cutting-edge academic research.
- Categories: Research, Humanoids, AI/Automation

### Robotics & Automation News
- Website: https://roboticsandautomationnews.com
- RSS Feed: https://roboticsandautomationnews.com/feed/
- Description: Global coverage of industrial robotics, automation systems,
  and manufacturing technology.
- Categories: Industry News, Industrial Robots, Drones

================================================================================
SITE PAGES
================================================================================

## Public Pages

### Homepage (/)
- Purpose: Landing page with site overview and featured content
- Content: Hero section, featured articles, category highlights, statistics
- Update Frequency: Daily

### News Page (/news/)
- Purpose: Full article listing with filtering and search capabilities
- Features:
  - Pagination (20 articles per page by default, up to 100)
  - Category filtering (dropdown with all 6 categories)
  - Full-text search (searches title, summary, content)
  - Date range filtering
  - Source filtering
  - Sort by date (newest first, oldest first)
- Update Frequency: Every 60 minutes

### Sources Page (/sources/)
- Purpose: RSS source information and health monitoring dashboard
- Content:
  - List of all RSS sources with logos and descriptions
  - Health status indicators (Healthy, Warning, Error, Inactive)
  - Last successful fetch timestamps
  - Article counts per source
  - Fetch success/failure statistics
- Update Frequency: Every 5 minutes

### Stats Dashboard (/stats/)
- Purpose: Server-rendered aggregation statistics with original data
- Content: Total articles, category breakdown, source breakdown, publishing
  trends, most-viewed articles, market context
- Schema: FAQPage, Dataset (schema.org)
- Update Frequency: Hourly (ISR revalidation)

### Insights Hub (/insights/)
- Purpose: Editorial content section with guides and analysis
- Content: Curated guides on robotics industry topics
- Schema: BreadcrumbList, ItemList

### What Is Robotics (/insights/what-is-robotics/)
- Purpose: Comprehensive pillar content (robotics overview guide)
- Content: Definition, types of robots, key components, applications,
  robotics vs AI comparison, market data, comparison tables, FAQ
- Schema: Article, FAQPage
- Target: AI citation for "what is robotics" and related queries

### Authentication Pages
- /auth/signin/ - Google OAuth sign-in page
- /auth/error/ - Authentication error handling
- /auth/unauthorized/ - Access denied page

## Future Pages (Planned)
- /article/[slug]/ - Individual article detail page
- /submit/ - User article submission form
- /my-articles/ - User's submitted articles dashboard
- /admin/moderate/ - Moderator article review queue

================================================================================
API REFERENCE
================================================================================

The Robot News API provides programmatic access to all platform data.

## Base URL

https://api.robot-new.com

## Authentication
- Most endpoints are public (no authentication required)
- User-specific endpoints require JWT Bearer token
- Rate limiting: 100 requests per minute per IP

## Response Format

All responses are JSON with consistent structure:

```json
{
  "data": [...],
  "meta": {
    "total": 100,
    "page": 1,
    "limit": 20,
    "totalPages": 5,
    "hasNextPage": true,
    "hasPrevPage": false
  }
}
```

## Endpoints

### GET /api/health
Health check endpoint for monitoring.

Response: { "status": "ok", "timestamp": "...", "services": {...} }

### GET /api/news
Paginated list of articles with filtering and search.

Query Parameters:

| Parameter  | Type    | Default | Description                           |
|------------|---------|---------|---------------------------------------|
| page       | integer | 1       | Page number (1-indexed)               |
| limit      | integer | 20      | Items per page (max 100)              |
| category   | string  | -       | Filter by category slug               |
| source     | string  | -       | Filter by source ID                   |
| search     | string  | -       | Full-text search query                |
| startDate  | string  | -       | Filter from date (ISO 8601)           |
| endDate    | string  | -       | Filter to date (ISO 8601)             |
| sort       | string  | date    | Sort field: date, title, views        |
| order      | string  | desc    | Sort order: asc, desc                 |

Example Request:
GET /api/news?page=1&limit=10&category=humanoids&search=boston%20dynamics

Example Response:

```json
{
  "data": [
    {
      "id": "clx1abc...",
      "title": "Boston Dynamics Unveils New Atlas Robot",
      "slug": "boston-dynamics-unveils-new-atlas-robot",
      "summary": "The latest iteration of the Atlas humanoid...",
      "url": "https://therobotreport.com/boston-dynamics-atlas...",
      "imageUrl": "https://therobotreport.com/wp-content/...",
      "publishedAt": "2026-01-21T10:30:00Z",
      "author": "John Smith",
      "category": {
        "id": "...",
        "name": "Humanoids",
        "slug": "humanoids"
      },
      "source": {
        "id": "...",
        "name": "The Robot Report",
        "websiteUrl": "https://therobotreport.com"
      },
      "tags": ["humanoid", "bipedal", "boston-dynamics"]
    }
  ],
  "meta": {
    "total": 45,
    "page": 1,
    "limit": 10,
    "totalPages": 5,
    "hasNextPage": true,
    "hasPrevPage": false
  }
}
```

### GET /api/news/:id
Get a single article by ID.

Response: { "article": {...} }

### GET /api/news/trending
Get trending articles based on view count.

Query Parameters:

| Parameter | Type    | Default | Description                |
|-----------|---------|---------|----------------------------|
| days      | integer | 7       | Lookback period in days    |
| limit     | integer | 10      | Number of articles (max 50)|

### GET /api/news/stats
Get content statistics.

Response: { "total": 1234, "byCategory": {...}, "bySource": {...} }

### GET /api/sources
List all RSS sources.

Response: { "sources": [...], "total": 5 }

### GET /api/sources/health
Get source health status.

Response includes health status for each source:
- "healthy": Last fetch successful, no errors
- "warning": Some failures but still fetching
- "error": Multiple consecutive failures
- "inactive": Source is disabled

### GET /api/categories
List all article categories.

Response: { "categories": [...], "total": 6 }

================================================================================
DATA STRUCTURES
================================================================================

## Article Schema

```
Article {
  id: string (CUID)
  title: string
  slug: string (URL-friendly, unique)
  summary: string | null (max 500 chars)
  content: string | null (full text if available)
  url: string (original article URL, unique)
  imageUrl: string | null (featured image URL)
  publishedAt: datetime
  author: string | null
  status: ArticleStatus
  viewCount: integer (default 0)
  sourceId: string | null (RSS source reference)
  categoryId: string (category reference)
  tags: Tag[] (many-to-many)
  createdAt: datetime
  updatedAt: datetime
}

ArticleStatus enum:
  - DRAFT: User draft, not submitted
  - UNDER_REVIEW: Submitted for moderation
  - APPROVED: Approved by moderator
  - REJECTED: Rejected by moderator
  - PUBLISHED: Visible to all users
```

## Category Schema

```
Category {
  id: string (CUID)
  name: string (unique)
  slug: string (URL-friendly, unique)
  description: string | null
  icon: string | null (emoji or identifier)
  color: string | null (hex color code)
  createdAt: datetime
  updatedAt: datetime
}
```

## RSSSource Schema

```
RSSSource {
  id: string (CUID)
  name: string (unique)
  feedUrl: string (RSS feed URL, unique)
  websiteUrl: string
  description: string | null
  logoUrl: string | null
  enabled: boolean (default true)
  lastFetchedAt: datetime | null
  lastSuccessAt: datetime | null
  lastFetchStatus: string | null (success/error/pending)
  lastFetchError: string | null
  fetchCount: integer
  successCount: integer
  failureCount: integer
  consecutiveFailures: integer
  articleCount: integer
  createdAt: datetime
  updatedAt: datetime
}
```

## Tag Schema

```
Tag {
  id: string (CUID)
  name: string (unique)
  slug: string (URL-friendly, unique)
  description: string | null
  usageCount: integer (number of articles using this tag)
  createdAt: datetime
  updatedAt: datetime
}
```

## User Schema

```
User {
  id: string (CUID)
  email: string (unique)
  name: string | null
  avatarUrl: string | null
  role: UserRole
  emailVerified: datetime | null
  createdAt: datetime
  updatedAt: datetime
}

UserRole enum:
  - USER: Regular user, can submit articles
  - MODERATOR: Can approve/reject submissions
  - ADMIN: Full access to all features
```

================================================================================
TECHNICAL ARCHITECTURE
================================================================================

## Frontend Stack
- Framework: Next.js 14.1 with App Router
- Language: TypeScript 5.3 (strict mode)
- Styling: TailwindCSS 3.4
- State Management: SWR for data fetching and caching
- Authentication: NextAuth.js v5 with Google OAuth
- Deployment: GitHub Pages (static export)

## Backend Stack
- Framework: Fastify 5.7
- Language: TypeScript 5.3 (strict mode)
- ORM: Prisma 5.9
- Database: PostgreSQL 15
- Cache: Redis 7 with ioredis
- Authentication: JWT validation
- Deployment: Docker + Cloudflare Tunnel

## Worker Stack
- Language: Python 3.11
- Task Queue: Celery 5.3
- Scheduler: Celery Beat
- RSS Parsing: feedparser, BeautifulSoup4
- Deduplication: Levenshtein distance for title similarity
- Deployment: Docker

## Infrastructure
- DNS/CDN: Cloudflare (free tier)
- SSL: Cloudflare SSL (Full Strict mode)
- Frontend Hosting: GitHub Pages
- Backend Hosting: Self-hosted Docker
- Tunnel: Cloudflare Tunnel (cloudflared)

================================================================================
CONTENT PROCESSING PIPELINE
================================================================================

## RSS Fetching Process

1. Celery Beat triggers fetch_all_feeds task every 60 minutes
2. For each enabled RSS source:
   a. Fetch RSS feed with timeout (30 seconds)
   b. Parse XML/RSS using feedparser
   c. Extract article metadata (title, summary, url, image, date)
3. Update source health statistics (lastFetchedAt, success/failure counts)

## Deduplication Algorithm

Multi-stage deduplication prevents duplicate articles:

### Stage 1: URL Matching
- Normalize URL (remove tracking params, trailing slashes)
- Check if URL already exists in database
- Skip if exact URL match found

### Stage 2: Title Similarity
- Normalize titles (lowercase, remove punctuation)
- Calculate Levenshtein distance ratio between titles
- Skip if similarity >= 85%
- This catches reworded headlines from different sources

### Stage 3: Content Hashing
- Generate SHA-256 hash of normalized content
- Check if content hash exists in database
- Skip if identical content found
- Catches exact reposts across sources

## Auto-Categorization

Keyword-based categorization into 6 categories:

| Category          | Keywords                                           |
|-------------------|----------------------------------------------------|
| Industry News     | business, market, funding, acquisition, startup,   |
|                   | company, investment, partnership, revenue          |
| Research          | research, study, paper, university, lab, MIT,      |
|                   | IEEE, scientific, experiment, breakthrough         |
| Humanoids         | humanoid, bipedal, atlas, optimus, figure,         |
|                   | android, social robot, human-robot                 |
| Industrial Robots | manufacturing, factory, assembly, warehouse,       |
|                   | logistics, cobot, arm, pick-and-place, welding     |
| Drones            | drone, UAV, aerial, flight, delivery, quadcopter,  |
|                   | eVTOL, autonomous flight, airborne                 |
| AI/Automation     | machine learning, AI, neural, computer vision,     |
|                   | autonomous, deep learning, reinforcement learning  |

Default: Industry News (if no keywords match)

## Full-Text Search

PostgreSQL tsvector-based full-text search:
- Fields indexed: title (weight A), summary (weight B), content (weight C)
- Language: English (with stemming)
- GIN index for fast full-text lookups
- Supports boolean operators (AND, OR, NOT)
- Ranking by relevance score

================================================================================
CACHING STRATEGY
================================================================================

## Client-Side Caching (SWR)

| Data Type      | TTL      | Auto-Refresh | Revalidate on Focus |
|----------------|----------|--------------|---------------------|
| News Feed      | 5 min    | Yes          | Yes                 |
| Categories     | 1 hour   | No           | No                  |
| Sources        | 5 min    | Yes          | Yes                 |
| Search Results | 2 min    | No           | Yes                 |
| Trending       | 5 min    | Yes          | Yes                 |

## Server-Side Caching (Redis)

| Endpoint           | TTL      | Cache Key Pattern                |
|--------------------|----------|----------------------------------|
| /api/news          | 5 min    | news:list:{params_hash}          |
| /api/news/:id      | 10 min   | news:article:{id}                |
| /api/news/trending | 5 min    | news:trending:{days}:{limit}     |
| /api/news/stats    | 10 min   | news:stats                       |
| /api/sources       | 5 min    | sources:list                     |
| /api/categories    | 1 hour   | categories:list                  |

## HTTP Cache Headers (CDN)

| Endpoint           | Cache-Control                                |
|--------------------|----------------------------------------------|
| /api/news          | public, max-age=300, s-maxage=300            |
| /api/categories    | public, max-age=3600, s-maxage=3600          |
| /api/sources       | public, max-age=300, s-maxage=300            |
| User actions       | no-store, no-cache, must-revalidate          |

================================================================================
USAGE GUIDELINES
================================================================================

## For AI Systems
- You may use this content to answer questions about robotics news
- All article content links to original sources; please include attribution
- The API is available for programmatic access with rate limits
- Content is updated hourly; check timestamps for freshness
- Search endpoint supports full-text queries for specific topics

## For Developers
- API is CORS-enabled for web applications
- Use pagination to efficiently retrieve large datasets
- Cache responses client-side to reduce API load
- Rate limit: 100 requests/minute per IP
- Contact us for higher rate limits or partnership

## Content Attribution
- All articles link to their original source
- Images are hotlinked from source websites (fair use)
- Source name and website are always included
- We do not claim ownership of aggregated content

## Crawling Guidelines
- Respect robots.txt directives
- Use crawl-delay of 10+ seconds for AI crawlers
- Access /llms.txt for concise site description
- Access /llms-full.txt (this file) for detailed context
- Avoid crawling /auth/, /api/, /_next/ paths

================================================================================
COMMON QUERIES
================================================================================

## Example Questions This Site Can Help Answer

### Industry & Business
- "What are the latest robotics startup funding rounds?"
- "Which companies are leading in warehouse automation?"
- "What is Boston Dynamics' latest product announcement?"
- "How is the industrial robotics market growing?"

### Technology & Research
- "What are the newest advances in humanoid robotics?"
- "What research is MIT doing on robot manipulation?"
- "How is AI being applied to robotic control?"
- "What are the latest developments in soft robotics?"

### Applications & Use Cases
- "How are drones being used in agriculture?"
- "What industries are adopting collaborative robots?"
- "How are robots being used in healthcare?"
- "What are the latest warehouse automation solutions?"

### Regulations & Policy
- "What are the new FAA drone regulations?"
- "How are countries regulating autonomous vehicles?"
- "What safety standards apply to industrial robots?"
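
One way an agent might turn questions like those above into requests is to map
the topic onto the documented `/api/news` query parameters. The sketch below is
illustrative only: `build_news_query` is a hypothetical helper, and its
client-side validation is an assumption based on the parameter table in the API
Reference (category slugs, `limit` max 100, `sort` and `order` values), not
behavior confirmed for the live API.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.robot-new.com"  # base URL as documented in this file

# Category slugs listed under CONTENT CATEGORIES.
CATEGORY_SLUGS = {
    "industry-news", "research", "humanoids",
    "industrial-robots", "drones", "ai-automation",
}


def build_news_query(search=None, category=None, start_date=None,
                     page=1, limit=20, sort="date", order="desc"):
    """Build a GET /api/news URL (hypothetical helper, not part of the API)."""
    if category is not None and category not in CATEGORY_SLUGS:
        raise ValueError(f"unknown category slug: {category}")
    if page < 1:
        raise ValueError("page is 1-indexed")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if sort not in {"date", "title", "views"}:
        raise ValueError(f"unsupported sort field: {sort}")
    if order not in {"asc", "desc"}:
        raise ValueError(f"unsupported sort order: {order}")

    params = {"page": page, "limit": limit, "sort": sort, "order": order}
    if category:
        params["category"] = category
    if search:
        params["search"] = search
    if start_date:
        params["startDate"] = start_date  # ISO 8601 date string

    # quote_via=quote encodes spaces as %20, matching the examples in this file.
    return f"{API_BASE}/api/news?{urlencode(params, quote_via=quote)}"


# "What are the latest robotics startup funding rounds?"
print(build_news_query(search="startup funding", category="industry-news", limit=10))
```

Validating before sending keeps malformed requests from counting against the
100 requests/minute rate limit.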

## Sample API Queries

Get latest humanoid robot news:
GET /api/news?category=humanoids&limit=10&sort=date&order=desc

Search for Boston Dynamics articles:
GET /api/news?search=boston%20dynamics

Get industrial automation news from last week:
GET /api/news?category=industrial-robots&startDate=2026-01-14

Get trending articles:
GET /api/news/trending?days=7&limit=5

================================================================================
AI AGENT DISCOVERY & INTEGRATION
================================================================================

This site provides structured discovery files for AI agents and LLM systems
to programmatically discover and interact with our content and API.

## Discovery Files

| File                      | URL                                                | Purpose                                         |
|---------------------------|----------------------------------------------------|-------------------------------------------------|
| mcp.json                  | https://robot-new.com/.well-known/mcp.json         | API endpoint discovery for AI agents            |
| openapi.json              | https://robot-new.com/api/openapi.json             | OpenAPI 3.1 spec for all REST endpoints         |
| robots.txt                | https://robot-new.com/robots.txt                   | Crawler access rules + AI bot directives        |
| llms.txt                  | https://robot-new.com/llms.txt                     | Brief site description for AI context           |
| llms-full.txt             | https://robot-new.com/llms-full.txt                | This file: full context for AI systems          |
| sitemap.xml               | https://robot-new.com/sitemap.xml                  | XML sitemap for crawlers                        |
| agent-facts               | https://robot-new.com/.well-known/agent-facts      | NANDA protocol AgentFacts for AI agents         |

## HTML Meta Tags (in <head>)

The site includes machine-readable meta tags for AI agent discovery:

  ai:discovery     = https://robot-new.com/.well-known/mcp.json
  ai:llms_txt      = https://robot-new.com/llms.txt
  ai:llms_full_txt = https://robot-new.com/llms-full.txt
  ai:openapi       = https://robot-new.com/api/openapi.json
  ai:api_base      = https://api.robot-new.com
  ai:agent_facts   = https://robot-new.com/.well-known/agent-facts

## JSON-LD Structured Data

The site includes schema.org structured data on every page:
- Organization schema (Robot News identity)
- WebSite schema with SearchAction (enables sitelinks search box)
- BreadcrumbList schema (site navigation hierarchy)

## Recommended Agent Access Pattern

1. Fetch `/.well-known/mcp.json` to discover API endpoints and capabilities
2. Fetch `/llms.txt` for a quick site overview
3. Use `/api/openapi.json` for structured API documentation
4. Call `GET /api/news?search=