Inheriting a 30K-User RPG Platform: When Your Dice Roll Snake Eyes on Legacy Code
Picture this: you're handed the keys to a platform serving 30,000 passionate RPG players, you spend five months learning the ropes and fixing things, and then—just three weeks ago—the authentication system decides to roll snake eyes spectacularly. Welcome to my life since February 2025, when I inherited SAVAGED.US v3—a character management platform for Savage Worlds that's apparently as wild and unpredictable as the RPG system itself.
What started as "Oh, I'll just maintain this little app" quickly turned into "Why is everything on fire and how did we get to 30,000 users so fast?" Turns out, maintaining a production platform at scale is a lot like running a Savage Worlds campaign—expect the unexpected, plan for critical failures, and always have a backup plan when your players (users) inevitably break something you thought was unbreakable.
The Challenge
Inheriting a production application serving thousands of daily active users is a unique challenge. The SAVAGED.US platform had grown organically over several years, and with nearly 30,000 registered users, systems that once worked perfectly were beginning to show strain. The most critical challenge came just three weeks ago, when authentication completely failed because its SQL queries returned result sets that grew with every new user until they started timing out.
Key Challenges Faced
- Scale-Related Failures: Authentication queries timing out due to user base growth
- Legacy Database Issues: MySQL queries optimized for hundreds of users, not tens of thousands
- Character Data Integrity: Complex character data validation and consistency
- Real-time Performance: WebSocket connections struggling under increased load
- PDF Generation Stability: SVG-to-PDF rendering breaking with complex character data
- Maintenance Continuity: Understanding and maintaining someone else's architecture
- Zero Downtime Requirements: 24/7 availability for an international user base
Architecture Decisions
Full-Stack TypeScript
The platform uses TypeScript across the entire stack for several compelling reasons:
// Shared interfaces between client and server
export interface PlayerCharacter {
id: string;
name: string;
attributes: CharacterAttributes;
skills: Skill[];
edges: Edge[];
hindrances: Hindrance[];
}
// Type-safe API responses
export interface SaveCharacterResponse {
success: boolean;
character_id: string;
last_modified: string;
}
The shared type definitions between client and server eliminated an entire class of integration bugs and made refactoring significantly safer.
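Shared interfaces are only checked at compile time, so a response arriving over the network is still untyped JSON at runtime. Here is a minimal sketch of a runtime guard for the `SaveCharacterResponse` shape shown above. The helper names `isSaveCharacterResponse` and `parseSaveCharacterResponse` are hypothetical and are not taken from the SAVAGED.US codebase.

```typescript
// Mirrors the SaveCharacterResponse interface from the article.
interface SaveCharacterResponse {
  success: boolean;
  character_id: string;
  last_modified: string;
}

// Hypothetical guard: narrows an unknown parsed JSON value to SaveCharacterResponse.
function isSaveCharacterResponse(value: unknown): value is SaveCharacterResponse {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.success === 'boolean' &&
    typeof v.character_id === 'string' &&
    typeof v.last_modified === 'string'
  );
}

// Hypothetical helper: parse a response body and fail loudly on a shape mismatch.
function parseSaveCharacterResponse(body: string): SaveCharacterResponse {
  const parsed: unknown = JSON.parse(body);
  if (!isSaveCharacterResponse(parsed)) {
    throw new Error('Unexpected SaveCharacterResponse shape');
  }
  return parsed;
}
```

A guard like this would sit at the one place where the client reads the response, so the rest of the code can rely on the shared type.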
React with Server-Side Rendering
The frontend uses React 18 with a custom SSR implementation rather than Next.js, giving us fine-grained control over the rendering process:
// Server-side rendering with Express
// (StaticRouter is imported from 'react-router-dom/server' in React Router v6)
app.get('*', async (req, res) => {
  const html = ReactDOMServer.renderToString(
    <StaticRouter location={req.url}>
      <App />
    </StaticRouter>
  );
  res.send(generateHTMLPage(html, req.user));
});
This approach allowed us to integrate deeply with our existing session management and authentication system.
Express + MySQL Backend
The backend architecture follows a clean separation of concerns:
src/server/
├── authRouter.ts # Authentication & session management
├── apiRouter.ts # Core API endpoints
├── contentRouter.ts # Game content delivery
├── adminRouter.ts # Admin-only endpoints
├── web_sockets.ts # Real-time WebSocket handling
└── utils.tsx # Shared utilities
The platform uses MySQL rather than a NoSQL database for two reasons: the relational structure of RPG data (characters belong to users, campaigns contain characters, etc.) maps naturally to SQL, and billing operations require ACID guarantees.
Solving Complex Problems
Character Data Management
RPG character data is inherently complex - deeply nested JSON with interdependent calculations. The platform includes a sophisticated validation and caching system:
class PlayerCharacter {
// Computed property: recalculated on every access (nothing is cached here)
get derivedAttributes(): DerivedAttributes {
return this.calculateDerivedAttributes();
}
// Real-time validation
validateCharacter(): ValidationResult {
const errors: string[] = [];
// Validate attribute constraints
if (this.attributes.total > this.getAttributeLimit()) {
errors.push('Attribute total exceeds limit');
}
// Validate skill prerequisites
this.skills.forEach(skill => {
if (!this.meetsSkillRequirements(skill)) {
errors.push(`Missing prerequisites for ${skill.name}`);
}
});
return { valid: errors.length === 0, errors };
}
}
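The getter above recomputes on every access. One way to add the caching the text mentions is to memoize the derived values and drop them whenever an input changes. The class below is a stripped-down stand-in for illustration, not the platform's `PlayerCharacter`. The two formulas (Parry as 2 plus half the Fighting die, Toughness as 2 plus half the Vigor die) are the basic Savage Worlds rules with no modifiers applied. `CachedCharacter`, `setFightingDie`, and `computeCount` are hypothetical names.

```typescript
interface DerivedAttributes {
  parry: number;
  toughness: number;
}

class CachedCharacter {
  private fightingDie: number;
  private vigorDie: number;
  private cached: DerivedAttributes | null = null;
  // Counts recalculations so the caching behavior is observable.
  public computeCount = 0;

  constructor(fightingDie: number, vigorDie: number) {
    this.fightingDie = fightingDie;
    this.vigorDie = vigorDie;
  }

  // Computed on first access, then served from the cache until invalidated.
  get derivedAttributes(): DerivedAttributes {
    if (this.cached === null) {
      this.computeCount += 1;
      this.cached = {
        parry: 2 + this.fightingDie / 2,
        toughness: 2 + this.vigorDie / 2,
      };
    }
    return this.cached;
  }

  // Any change to an input must drop the cached result.
  setFightingDie(die: number): void {
    this.fightingDie = die;
    this.cached = null;
  }
}
```

The trade-off is that every mutation path has to remember to invalidate, which is why a single setter per input is used here.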
Real-time Updates with WebSockets
The platform supports real-time notifications and live campaign updates:
// WebSocket connection with session authentication (browser WebSocket API)
const ws = new WebSocket(`wss://${domain}/ws`);
ws.addEventListener('message', (event) => {
const message = JSON.parse(event.data);
switch (message.type) {
case 'notification':
showNotification(message.payload);
break;
case 'campaign_update':
updateCampaignState(message.payload);
break;
}
});
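`JSON.parse` returns an untyped value, so a malformed or unexpected message would flow straight into the `switch`. A sketch of validating incoming messages against a discriminated union first is shown below. The payload shapes (`text`, `campaignId`) are assumptions made for illustration, since the real payloads are not shown in this post, and `parseServerMessage` is a hypothetical helper.

```typescript
// Assumed message shapes; only the two type names come from the article.
type ServerMessage =
  | { type: 'notification'; payload: { text: string } }
  | { type: 'campaign_update'; payload: { campaignId: string } };

// Returns a typed message, or null if the input is not valid JSON
// or does not match one of the known shapes.
function parseServerMessage(raw: string): ServerMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    return null;
  }
  if (typeof parsed !== 'object' || parsed === null) return null;

  const msg = parsed as { type?: unknown; payload?: unknown };
  if (typeof msg.payload !== 'object' || msg.payload === null) return null;
  const payload = msg.payload as Record<string, unknown>;

  if (msg.type === 'notification' && typeof payload.text === 'string') {
    return { type: 'notification', payload: { text: payload.text } };
  }
  if (msg.type === 'campaign_update' && typeof payload.campaignId === 'string') {
    return { type: 'campaign_update', payload: { campaignId: payload.campaignId } };
  }
  return null;
}
```

With this in place the handler can switch on `message.type` and TypeScript narrows `message.payload` in each case.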
Advanced Admin Features
One of the most interesting features is the admin user switching capability, allowing administrators to impersonate users for debugging:
// Admin can switch to any user context
app.post('/admin-api/users-switch', async (req, res) => {
const admin = await fetchUser(req.session.user_id); // Check the real account, not an active override, so admins can switch again
if (!admin?.isAdmin) return res.status(403).json({error: 'Forbidden'});
// Store the override in session
req.session.override_user = req.body.user_id;
res.json({success: true});
});
// All API calls respect the override
async function getAPIUser(req: Request): Promise<User | null> {
const userId = req.session.override_user || req.session.user_id;
return await fetchUser(userId);
}
Performance Optimizations
Intelligent Caching
I implemented a multi-layer caching strategy:
// In-memory cache for character generation data
const chargenCache = new Map<string, any>();
async function getCachedChargenData(bookId: string) {
if (chargenCache.has(bookId)) {
return chargenCache.get(bookId);
}
const data = await loadChargenDataFromDB(bookId);
chargenCache.set(bookId, data);
return data;
}
// Cache invalidation on content updates
app.put('/admin-api/content/:id', async (req, res) => {
await updateContent(req.params.id, req.body);
chargenCache.clear(); // Invalidate cache
res.json({success: true});
});
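One gap in a plain `Map` cache is that several requests arriving before the first load finishes will each hit the database. A possible refinement, sketched here under the assumption that loads are keyed by a string such as `bookId`, is to share the in-flight promise between callers. `DedupingCache` and `getOrLoad` are hypothetical names.

```typescript
class DedupingCache<V> {
  private values = new Map<string, V>();
  private inFlight = new Map<string, Promise<V>>();
  // Incremented on clear() so loads started before the clear do not repopulate the cache.
  private generation = 0;

  getOrLoad(key: string, loader: (key: string) => Promise<V>): Promise<V> {
    if (this.values.has(key)) {
      return Promise.resolve(this.values.get(key) as V);
    }
    const pending = this.inFlight.get(key);
    if (pending !== undefined) {
      return pending; // Reuse the load that is already running
    }

    const startedIn = this.generation;
    const promise: Promise<V> = loader(key).then(
      (value) => {
        if (startedIn === this.generation) this.values.set(key, value);
        if (this.inFlight.get(key) === promise) this.inFlight.delete(key);
        return value;
      },
      (err) => {
        // Failed loads are not cached, so the next caller retries.
        if (this.inFlight.get(key) === promise) this.inFlight.delete(key);
        throw err;
      }
    );
    this.inFlight.set(key, promise);
    return promise;
  }

  clear(): void {
    this.generation += 1;
    this.values.clear();
    this.inFlight.clear();
  }
}
```

Two calls for the same key made back to back receive the same promise, so the loader runs once.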
Code Splitting and Lazy Loading
Large admin interfaces are lazy-loaded to keep the initial bundle size manageable:
// Lazy-loaded admin routes
const AdminRouter = lazy(() => import('./admin/_router'));
const ToolsRouter = lazy(() => import('./tools/_router'));
// Route configuration
<Routes>
<Route path="/admin/*" element={
<Suspense fallback={<LoadingSpinner />}>
<AdminRouter />
</Suspense>
} />
</Routes>
Crisis Management and Scaling Solutions
The past five months have been a masterclass in production crisis management and scaling challenges. Here are the major issues I've tackled:
The Great Authentication Crisis (AKA: When SQL Queries Fumble Their Attack Roll)
Three weeks ago, the platform's authentication system rolled a critical failure so spectacular that it would make even the most dramatic Savage Worlds character death look understated. Users couldn't log in, and the cause was a classic scaling problem: the original queries were perfectly designed for the user base they were written for, but Jeff Gordon, the platform's original developer, probably never imagined it would grow to nearly 30,000 users:
// The problematic query that worked fine with 5k users
const getUserPermissions = `
SELECT u.*, p.permissions, r.roles
FROM users u
LEFT JOIN user_permissions p ON u.id = p.user_id
LEFT JOIN user_roles r ON u.id = r.user_id
WHERE u.active = 1
`; // This returned 30,000+ rows and timed out
// The fix: look up only the user who is logging in instead of every active user
const getSpecificUserPermissions = `
SELECT u.*, p.permissions, r.roles
FROM users u
LEFT JOIN user_permissions p ON u.id = p.user_id
LEFT JOIN user_roles r ON u.id = r.user_id
WHERE u.id = ? AND u.active = 1
LIMIT 1
`; // Now returns 1 row in milliseconds
The lesson: queries that work fine with hundreds of users can become the digital equivalent of a TPK (Total Party Kill) at scale. Who knew?
Database Connection Pool Crisis (When Your Connection Pool Rolls a 1)
The MySQL connection handling that Jeff had set up worked beautifully for the original user base, but wasn't designed for the current load. What happened was users would click "login" and then... nothing. The request would just hang there like a player who can't decide which spell to cast. Turns out every login was going through a single database connection with no pool behind it, and that connection couldn't keep up with the concurrent login attempts.
Jeff's original implementation used the default mysql2 settings, which were perfectly reasonable for the platform's initial scale but couldn't handle hundreds of people trying to log in simultaneously during peak gaming hours (apparently Sunday evenings are when everyone decides to prep their characters).
// The original problematic setup (implicit defaults)
const connection = mysql.createConnection({
host: process.env.DB_HOST,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: process.env.DB_DATABASE
}); // No pooling, no timeouts, no error handling
// Fixed the hanging login issue with proper connection pooling
const pool = mysql.createPool({
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_DATABASE,
  connectionLimit: 20,      // Increased from default 10
  waitForConnections: true, // Queue requests when every connection is busy
  queueLimit: 0,            // 0 = no cap on queued requests
  connectTimeout: 60000,    // 60 seconds to establish a new connection
  charset: 'utf8mb4',
  multipleStatements: false,
  ssl: false // Disabled for local dev performance
  // acquireTimeout, timeout, and reconnect are omitted: mysql2 does not
  // recognize them as pool options and ignores them with a warning
});
// Added comprehensive error handling for connection failures
pool.on('connection', (connection) => {
  console.log(`Connected as id ${connection.threadId}`);
  // Set session variables for this connection
  connection.query("SET SESSION sql_mode='STRICT_TRANS_TABLES,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO'");
  // The pool does not emit 'error' itself, so listen on each connection
  connection.on('error', (err) => {
    console.error('Database connection error:', err);
    if (err.code === 'PROTOCOL_CONNECTION_LOST') {
      console.log('Connection lost; the pool will open a new one on demand.');
    } else if (err.code === 'ER_CON_COUNT_ERROR') {
      console.error('Too many database connections! Consider increasing connectionLimit.');
    }
  });
});
// Added query timeout wrapper for all database operations
const queryWithTimeout = (query: string, params: any[] = []) => {
return new Promise((resolve, reject) => {
const timeout = setTimeout(() => {
reject(new Error(`Query timeout: ${query.substring(0, 50)}...`));
}, 30000); // 30 second timeout
pool.query(query, params, (error, results) => {
clearTimeout(timeout);
if (error) reject(error);
else resolve(results);
});
});
};
This change alone reduced login failures from about 30% to virtually zero. The key was understanding that RPG players are creatures of habit—they all log in around the same time (Sunday evening before game night), so you need to handle those traffic spikes gracefully.
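To see why the pool size matters during a Sunday-evening spike, a back-of-the-envelope calculation helps. The sketch below is plain arithmetic, not a measurement from the platform: the burst size and query durations are hypothetical, and it assumes one query per login and no other traffic.

```typescript
// Rough capacity arithmetic for a connection pool (all inputs are hypothetical).

/** Queries per second the pool can sustain if every query takes avgQueryMs. */
function poolThroughputPerSecond(connectionLimit: number, avgQueryMs: number): number {
  return (connectionLimit * 1000) / avgQueryMs;
}

/** Seconds needed to work through a burst of simultaneous logins, one query each. */
function secondsToDrainBurst(
  burstSize: number,
  connectionLimit: number,
  avgQueryMs: number
): number {
  return burstSize / poolThroughputPerSecond(connectionLimit, avgQueryMs);
}

// 300 simultaneous logins at 50 ms per login query
const drainWithTen = secondsToDrainBurst(300, 10, 50);    // 1.5 seconds
const drainWithTwenty = secondsToDrainBurst(300, 20, 50); // 0.75 seconds

// The same burst when each login query takes 2 seconds
const drainWithSlowQuery = secondsToDrainBurst(300, 10, 2000); // 60 seconds
```

The last line is the important one: query duration dominates, so fixing the slow query and enlarging the pool work together.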
Discord OAuth Integration (Because Everyone Lives in Discord Anyway)
After the great authentication crisis, I realized I needed to diversify login options. Since pretty much every RPG player already lives in Discord (seriously, where else are you going to coordinate your weekly sessions?), adding Discord OAuth seemed like a no-brainer.
But of course, nothing is ever simple when you're dealing with legacy user data. The challenge was handling all the edge cases: existing users with matching emails, users who wanted to link their accounts, dormant accounts that needed reactivation, and the inevitable "I have three Discord accounts and can't remember which one I used" scenarios.
// The full Discord OAuth implementation with all the edge case handling
app.post('/auth/discord', async (req, res) => {
try {
const { code } = req.body;
// Exchange code for Discord user info
const discordUser = await getDiscordUser(code);
if (!discordUser?.email || !discordUser.verified) {
return res.status(400).json({
error: 'Discord account must have a verified email address'
});
}
// Check for existing user by email
let existingUser = await findUserByEmail(discordUser.email);
if (existingUser && existingUser.discord_id) {
// User already linked, just log them in
await updateLastLogin(existingUser.id);
return res.json({ success: true, user: existingUser });
} else if (existingUser && !existingUser.discord_id) {
// Existing user wants to link Discord account
await linkDiscordAccount(existingUser.id, discordUser.id, discordUser.username);
// Reactivate dormant accounts (some users hadn't logged in for years)
if (!existingUser.active) {
await activateUser(existingUser.id);
console.log(`Reactivated dormant user ${existingUser.id} via Discord linking`);
}
existingUser.discord_id = discordUser.id;
await updateLastLogin(existingUser.id);
return res.json({ success: true, user: existingUser, linked: true });
} else {
// Brand new user from Discord
const newUser = await createUserFromDiscord({
discord_id: discordUser.id,
discord_username: discordUser.username,
email: discordUser.email,
display_name: discordUser.global_name || discordUser.username,
avatar_url: discordUser.avatar ?
`https://cdn.discordapp.com/avatars/${discordUser.id}/${discordUser.avatar}.png` :
null
});
// Auto-join them to the Discord server announcements channel
await subscribeToAnnouncements(newUser.id);
return res.json({ success: true, user: newUser, new_user: true });
}
} catch (error) {
console.error('Discord OAuth error:', error);
// Handle specific Discord API errors
if (error.message.includes('invalid_grant')) {
return res.status(400).json({
error: 'Discord authorization expired. Please try again.'
});
}
if (error.message.includes('rate_limit')) {
return res.status(429).json({
error: 'Too many login attempts. Please wait a moment and try again.'
});
}
res.status(500).json({
error: 'Authentication failed',
details: process.env.NODE_ENV === 'development' ? error.message : undefined
});
}
});
// Helper function to get Discord user data
async function getDiscordUser(code: string) {
// Exchange authorization code for access token
const tokenResponse = await fetch('https://discord.com/api/oauth2/token', {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
body: new URLSearchParams({
client_id: process.env.DISCORD_CLIENT_ID,
client_secret: process.env.DISCORD_CLIENT_SECRET,
grant_type: 'authorization_code',
code: code,
redirect_uri: `${process.env.ENV_URL}/auth/discord/callback`
})
});
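if (!tokenResponse.ok) {
// Include the response body so callers can match OAuth error codes such as invalid_grant
throw new Error(`Discord token exchange failed: ${tokenResponse.status} ${await tokenResponse.text()}`);
}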
const tokenData = await tokenResponse.json();
// Get user information
const userResponse = await fetch('https://discord.com/api/users/@me', {
headers: { 'Authorization': `Bearer ${tokenData.access_token}` }
});
if (!userResponse.ok) {
throw new Error(`Discord user fetch failed: ${userResponse.statusText}`);
}
return await userResponse.json();
}
// Handle account linking for existing users
async function linkDiscordAccount(userId: string, discordId: string, discordUsername: string) {
  // The pool is callback-based, so use the promise-returning queryWithTimeout wrapper
  await queryWithTimeout(
    'UPDATE users SET discord_id = ?, discord_username = ?, linked_at = NOW() WHERE id = ?',
    [discordId, discordUsername, userId]
  );
  // Log the linking event for audit purposes
  await queryWithTimeout(
    'INSERT INTO user_events (user_id, event_type, event_data) VALUES (?, ?, ?)',
    [userId, 'discord_linked', JSON.stringify({ discord_id: discordId, discord_username: discordUsername })]
  );
}
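The branching in the route handler is easier to test when the decision is separated from the database calls. Below is a sketch of that decision as a pure function. It also adds one check the handler above does not make: rejecting a login when the stored `discord_id` differs from the Discord account presenting the email. That extra rule is a suggestion, not current platform behavior, and `decideDiscordLogin`, `StoredUser`, `DiscordProfile`, and `LoginAction` are hypothetical names.

```typescript
interface StoredUser {
  id: string;
  discord_id: string | null;
  active: boolean;
}

interface DiscordProfile {
  id: string;
  email: string;
  verified: boolean;
}

type LoginAction =
  | { kind: 'reject'; reason: string }
  | { kind: 'login'; userId: string }
  | { kind: 'link'; userId: string; reactivate: boolean }
  | { kind: 'create' };

// Decides what to do with a Discord login; performs no I/O.
function decideDiscordLogin(profile: DiscordProfile, existing: StoredUser | null): LoginAction {
  if (!profile.verified) {
    return { kind: 'reject', reason: 'unverified_email' };
  }
  if (existing === null) {
    return { kind: 'create' };
  }
  if (existing.discord_id === null) {
    // First Discord login for an existing account: link it, reactivating if dormant.
    return { kind: 'link', userId: existing.id, reactivate: !existing.active };
  }
  if (existing.discord_id !== profile.id) {
    // Same email, different Discord account than the one already linked.
    return { kind: 'reject', reason: 'discord_id_mismatch' };
  }
  return { kind: 'login', userId: existing.id };
}
```

The handler would then switch on `kind` and call `linkDiscordAccount`, `activateUser`, or `createUserFromDiscord` as appropriate.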
The Discord integration has been a huge success. About 40% of new registrations now come through Discord, and it's significantly reduced password reset requests (because nobody remembers their SAVAGED.US password, but everyone's already logged into Discord).
Plus, having Discord IDs opens up possibilities for future integrations—like notifications for character sheet updates, game session reminders, or even a Discord bot that can pull character stats during games.
PayPal Integration and Financial Reporting
Added comprehensive financial tracking for the growing subscription base:
interface QuarterlyReport {
period: string;
totalRevenue: number;
activeSubscriptions: number;
newSubscriptions: number;
churned: number;
partnerShares: PartnerShare[];
transactionBreakdown: TransactionSummary[];
}
// Automated reporting for business stakeholders
const generateQuarterlyReport = async (quarter: string): Promise<QuarterlyReport> => {
const transactions = await getPayPalTransactions(quarter);
const revenue = calculateRevenue(transactions);
const partnerShares = calculatePartnerShares(revenue);
return {
period: quarter,
totalRevenue: revenue.total,
activeSubscriptions: revenue.activeCount,
newSubscriptions: revenue.newCount,
churned: revenue.churnedCount,
partnerShares,
transactionBreakdown: revenue.breakdown
};
};
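`calculatePartnerShares` is not shown above, so here is one way such a split could work, assuming revenue is tracked in integer cents and partner percentages are whole numbers that sum to 100. The shape of `PartnerShare` and the names `PartnerPercent` and `splitRevenueCents` are assumptions for illustration. Leftover cents from rounding go to the partners with the largest remainders, so the shares always add up to the total.

```typescript
interface PartnerPercent {
  partner: string;
  percent: number; // whole-number percentage
}

interface PartnerShare {
  partner: string;
  amountCents: number;
}

function splitRevenueCents(totalCents: number, partners: PartnerPercent[]): PartnerShare[] {
  const totalPercent = partners.reduce((sum, p) => sum + p.percent, 0);
  if (totalPercent !== 100) {
    throw new Error(`Percentages sum to ${totalPercent}, expected 100`);
  }

  // Integer math only: floor each share and keep the remainder (in hundredths of a cent).
  const working = partners.map((p) => {
    const scaled = totalCents * p.percent;
    return {
      partner: p.partner,
      amountCents: Math.floor(scaled / 100),
      remainder: scaled % 100,
    };
  });

  // Hand out the cents lost to flooring, largest remainder first.
  let leftover = totalCents - working.reduce((sum, w) => sum + w.amountCents, 0);
  const byRemainder = working.slice().sort((a, b) => b.remainder - a.remainder);
  for (const w of byRemainder) {
    if (leftover <= 0) break;
    w.amountCents += 1; // same object as in `working`, so the update is visible there
    leftover -= 1;
  }

  return working.map((w) => ({ partner: w.partner, amountCents: w.amountCents }));
}
```

Working in cents avoids the floating-point drift that shows up when currency is stored as fractional dollars.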
Character Sheet Performance Optimization
Fixed critical rendering issues affecting thousands of character sheets:
// Eliminated duplicate power entries that were causing rendering failures
function deduplicateCharacterElements<T extends { id: string }>(elements: T[]): T[] {
const seen = new Set<string>();
return elements.filter(element => {
if (seen.has(element.id)) {
console.warn(`Duplicate element found: ${element.id}`);
return false;
}
seen.add(element.id);
return true;
});
}
// Applied defensive programming throughout rendering pipeline
const renderCharacterSheet = (character: Character): string => {
try {
const powers = deduplicateCharacterElements(character.powers || []);
const edges = deduplicateCharacterElements(character.edges || []);
const hindrances = deduplicateCharacterElements(character.hindrances || []);
return generateSVG({ ...character, powers, edges, hindrances });
} catch (error) {
console.error('Character sheet rendering failed:', error);
return generateErrorSheet(character.id);
}
};
Lessons Learned from Production Crisis Management
Scale Changes Everything
What I learned taking over a platform at scale: every design decision that worked for 1,000 users needed reevaluation at 30,000. Database queries, connection pooling, caching strategies—everything required rethinking.
Defensive Programming Is Not Optional
With thousands of users depending on the platform daily, defensive programming isn't a nice-to-have—it's essential. Every database query needs error handling, every data transformation needs validation, and every user-facing feature needs graceful degradation.
Monitoring Is Your Best Friend
Jeff had focused on building features that users loved rather than extensive logging (totally understandable—who wants to spend time on boring infrastructure when you could be building cool character management tools?). Adding detailed monitoring and error tracking became crucial for understanding where the platform was breaking under the new load.
// Added comprehensive logging for database performance
const logSlowQueries = (query: string, duration: number) => {
if (duration > 1000) { // Log queries taking more than 1 second
console.warn(`Slow query (${duration}ms):`, query.substring(0, 100));
}
};
// Monitor authentication performance specifically
const loginAttempt = async (email: string) => {
const start = Date.now();
try {
const result = await authenticateUser(email);
logSlowQueries('Login query', Date.now() - start);
return result;
} catch (error) {
console.error('Login failed:', error, { email, duration: Date.now() - start });
throw error;
}
};
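Logging only the queries that cross a threshold shows the worst cases but not the overall trend. A small complement, sketched here with hypothetical names (`percentile`, `LatencyTracker`), is to keep a rolling window of durations and report percentiles. It uses the nearest-rank method and holds samples in memory only.

```typescript
/** Nearest-rank percentile of a list of durations in milliseconds. */
function percentile(durationsMs: number[], p: number): number {
  if (durationsMs.length === 0) throw new Error('No samples recorded');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = durationsMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

/** Keeps the most recent maxSamples durations and summarizes them. */
class LatencyTracker {
  private samples: number[] = [];
  private maxSamples: number;

  constructor(maxSamples: number = 1000) {
    this.maxSamples = maxSamples;
  }

  record(durationMs: number): void {
    this.samples.push(durationMs);
    if (this.samples.length > this.maxSamples) {
      this.samples.shift(); // Drop the oldest sample
    }
  }

  summary(): { count: number; p50: number; p95: number } {
    return {
      count: this.samples.length,
      p50: percentile(this.samples, 50),
      p95: percentile(this.samples, 95),
    };
  }
}
```

Calling `record(Date.now() - start)` next to the existing `logSlowQueries` call would be enough to feed it.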
Documentation Debt Is Real
Inheriting a codebase without comprehensive documentation meant spending months understanding systems that could have been explained in hours. I've since added extensive documentation for future maintainers.
Community-Driven Development (AKA: Players Make the Best Beta Testers)
Here's the thing about RPG players—they're incredibly dedicated to their characters and surprisingly good at bug reports. When authentication failed, instead of rage-quitting like typical users, they patiently provided detailed descriptions of what went wrong. When character sheets broke, they helped identify patterns in the corruption with the same analytical skills they use to optimize their builds.
But I have to give a massive shoutout to the SAVAGED.US Discord community—these folks have been absolutely incredible. They've been my unofficial QA team, my emotional support network, and my reality check all rolled into one. When I push a fix at 2 AM, they're there testing it. When I accidentally break something (which happens more often than I'd like to admit), they catch it and report it with screenshots, browser details, and step-by-step reproduction instructions that would make professional testers weep with joy.
These Discord heroes have helped me track down bugs I never would have found on my own, tested edge cases I never would have thought of, and provided feedback that's shaped every improvement to the platform. They've turned what could have been a lonely nightmare of maintaining legacy code into a collaborative effort where I actually feel supported.
To everyone in that Discord server who has submitted a bug report, tested a fix, provided feedback, or just offered encouragement when things were broken—thank you. Seriously. You've made this journey infinitely better, and the platform is rock-solid today because of your help. You're the real MVPs of the SAVAGED.US story.
Managing a platform for gamers is like being a GM for 30,000 players simultaneously—they're passionate, they notice everything, and they'll definitely let you know when you've made a mistake. But they're also incredibly supportive when you're trying to fix things. It's the kind of user base that makes the 3 AM debugging sessions actually worth it.
What's Next
The platform continues to evolve with several exciting features in development:
- Campaign Management 2.0: Enhanced real-time collaboration features
- Mobile App: Native iOS/Android apps using React Native
- Third-party Integrations: Roll20, Foundry VTT, and Discord bot integrations
- Advanced Analytics: Player behavior analytics and content usage metrics
Technical Specifications
For those interested in the technical details:
- Frontend: React 18, TypeScript, React Router, Bootstrap
- Backend: Express.js, Node.js, TypeScript
- Database: MySQL with JSON columns for character data
- Real-time: WebSockets with session-based authentication
- Build System: Webpack with custom configurations
- PDF Generation: Puppeteer + SVG-to-PDF conversion
- Deployment: Custom deployment scripts with Discord notifications
Inheriting and scaling SAVAGED.US has been one of the most challenging and educational experiences of my career. It's taught me that maintaining production software at scale is fundamentally different from building new applications. Every line of code affects real users, every optimization impacts real workflows, and every failure disrupts real gaming sessions.
The platform now serves nearly 30,000 users with improved stability, better performance, and robust error handling. Most importantly, it continues to serve the passionate Savage Worlds community that depends on it for their weekly gaming sessions.
You can see the results of this scaling work at SAVAGED.US, where thousands of RPG players create and manage their characters daily—now with the confidence that the platform can handle whatever challenges come next.
A Thank You to Jeff and Kim
Finally, I need to give massive credit where it's due. Jeff Gordon built something truly special with SAVAGED.US. When I inherited this platform, I didn't just get a codebase—I got a thriving community of 30,000 users who absolutely loved what Jeff had created. The architecture was solid, the features were thoughtful, and the attention to detail in character management was incredible.
Every improvement I've made has been building on Jeff's excellent foundation. The TypeScript implementation was clean and well-structured. The React components were logical and maintainable. The database schema was well-designed for RPG data complexity. Most importantly, Jeff built something that RPG players genuinely wanted to use—no small feat in a world full of half-baked character builders.
The challenges I've faced aren't flaws in Jeff's work—they're the natural growing pains of a successful platform that far exceeded its original scope. That's a good problem to have, and it's a testament to how well Jeff built SAVAGED.US that it could scale to this level with relatively minor adjustments.
To Kim: thank you for putting up with all this gaming nonsense your husband built! I can only imagine the countless hours Jeff spent hunched over his computer, building character sheets and debugging database queries while you probably wondered why anyone needed that many dice rolling algorithms. Your patience and support made it possible for Jeff to create something that brings joy to thousands of RPG players every week. It's been an honor to carry on Jeff's work and help SAVAGED.US continue growing, and I hope I'm doing justice to what you both made possible.
Keep those dice rolling high.