User guide
How to use RedFlagger
Step-by-step walkthrough from sign-up to running scans, reading reports, and managing your data.
1. Getting started
RedFlagger is a self-audit tool: you connect your own social accounts (via OAuth where the platform requires it), run a scan, and see what an algorithmic reviewer might flag. You can't scan anyone else's accounts.
Sign up
- Go to the registration page.
- Enter your email and create a password (at least 12 characters).
- Check your inbox for a one-time verification code.
- You'll land on the dashboard once verified.
Choose a plan
New accounts get free credits to try a scan. To run more, pick a plan from the Plans page. Basic covers most individual self-audit users; Pro is better if you scan often or want full PDF reports.
2. Connect your social accounts
Each platform has slightly different requirements. Where a connection is needed, the connect flow always opens the platform's own OAuth dialog, so we never see your password.
Reddit
Reddit posts are public, so no OAuth is required. When you start a Reddit scan, just enter the username (without the u/ prefix) and confirm consent.
X / Twitter
Same as Reddit — public posts only, no connection required. Enter the handle (without @).
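If you paste a handle from a profile URL or a mention, it may still carry the prefix. A minimal sketch of the cleanup described above, assuming a hypothetical helper named normalize_handle and the platform keys "reddit" and "x" (neither is part of RedFlagger's documented interface):

```python
def normalize_handle(raw: str, platform: str) -> str:
    """Strip surrounding whitespace and a leading platform prefix.

    The prefix table is an assumption based on the guide: Reddit
    usernames are entered without "u/", X handles without "@".
    """
    prefixes = {
        "reddit": ("/u/", "u/"),  # check the longer form first
        "x": ("@",),
    }
    handle = raw.strip()
    for prefix in prefixes.get(platform, ()):
        if handle.lower().startswith(prefix):
            handle = handle[len(prefix):]
            break  # remove at most one prefix
    return handle


if __name__ == "__main__":
    print(normalize_handle("u/example_user", "reddit"))
    print(normalize_handle(" @example ", "x"))
```

A handle that is already clean passes through unchanged, so applying the helper twice is harmless.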
Facebook
- Open Settings → click Connect next to Facebook.
- You'll be redirected to Facebook's consent screen.
- Authorize the requested permissions (read posts and photos). RedFlagger only reads — we can't post or modify anything.
- You'll be redirected back to Settings with a green "Connected" badge.
Instagram
Instagram has an extra requirement: your account must be set to Professional (Business or Creator), not Personal. Personal accounts can't use the API at all.
How to switch to Professional
- Open Instagram on your phone.
- Profile → menu → Settings and privacy.
- Account type and tools → Switch to Professional.
- Pick Creator or Business — either works for RedFlagger.
- Once switched, return to RedFlagger Settings and click Connect Instagram.
3. Run your first scan
- Click Scan in the navbar.
- Pick a platform. Connected platforms show a green dot; others are grayed out until you connect them.
- Fill in the username. For OAuth platforms (Facebook, Instagram), the field auto-fills with your connected account name and is read-only. For Reddit and X, type the public handle you want to scan.
- Pick categories. Each category is a different lens the AI applies (toxicity, sentiment, identity safety, profanity, image safety). Selecting fewer categories makes the scan faster but skips the signals those categories would catch.
- Set the date range. By default we scan all available history. You can narrow to the last week, the last month, or a custom range.
- Confirm consent and click Run AI Scan.
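The scan options in the steps above can be pictured as a small piece of data. The sketch below is illustrative only: the preset names, the snake_case category spellings, the 30-day reading of "last month", and the rule that at least one category must be selected are all assumptions, not RedFlagger's actual behavior.

```python
from datetime import date, timedelta
from typing import Iterable, List, Optional, Tuple

# Category names come from the guide; the snake_case spellings are assumed.
CATEGORIES = {"toxicity", "sentiment", "identity_safety", "profanity", "image_safety"}

DateRange = Tuple[date, date]


def resolve_date_range(
    preset: str, today: date, custom: Optional[DateRange] = None
) -> Optional[DateRange]:
    """Return (start, end), or None to mean all available history."""
    if preset == "all":
        return None
    if preset == "last_week":
        return (today - timedelta(days=7), today)
    if preset == "last_month":
        # Assumption: "last month" is treated as the last 30 days.
        return (today - timedelta(days=30), today)
    if preset == "custom":
        if custom is None or custom[0] > custom[1]:
            raise ValueError("custom range needs a start on or before its end")
        return custom
    raise ValueError(f"unknown preset: {preset!r}")


def validate_categories(selected: Iterable[str]) -> List[str]:
    """Reject unknown names and return the selection sorted and de-duplicated."""
    chosen = set(selected)
    if not chosen:
        raise ValueError("select at least one category")
    unknown = chosen - CATEGORIES
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sorted(chosen)
```

Returning None for the default keeps "all available history" distinct from any concrete range, which mirrors how the guide describes the default.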
4. Read your report
Your report is organized as follows:
Summary
Top of the page. Total posts scanned, total flags, and an overall risk band (Low / Medium / High). The risk band is just a heuristic — the per-post detail underneath is what matters.
Flagged posts
Each flag shows the post, the category that triggered it, the model's confidence, and the model's reasoning. Click into any flag for the full text.
What to do with flags
- If you agree — the report has done its job. Decide whether to leave, edit, or delete the post on the platform.
- If it's a false positive — click Mark as false positive. Your report updates and the AI's reasoning is hidden.
- If you want context — open the post on the source platform via the permalink. Sometimes seeing it in its original context (a thread, a reply chain) changes how it reads.
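To make the pieces of a flag concrete, here is a rough sketch of how one might be modeled. The field names, the 0 to 1 confidence scale, and the helper functions are hypothetical; the guide only says that a flag carries the post, the category, the model's confidence, and its reasoning, and that marking a false positive hides the reasoning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flag:
    post_text: str
    category: str
    confidence: float          # assumed to be a score between 0.0 and 1.0
    reasoning: str             # the model's explanation for the flag
    permalink: str             # link back to the post on the source platform
    false_positive: bool = False

    def visible_reasoning(self) -> Optional[str]:
        """Reasoning is hidden once the flag is marked as a false positive."""
        return None if self.false_positive else self.reasoning


def mark_false_positive(flag: Flag) -> Flag:
    """Mark the flag; the stored reasoning is kept but no longer shown."""
    flag.false_positive = True
    return flag


if __name__ == "__main__":
    flag = Flag(
        post_text="Oh great, another Monday",
        category="sentiment",
        confidence=0.81,  # made-up value for illustration
        reasoning="Reads as negative sentiment.",
        permalink="https://example.com/post/1",
    )
    mark_false_positive(flag)
    print(flag.visible_reasoning())
```

Keeping the reasoning stored and hiding it only at display time is a design choice of this sketch, not something the guide states.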
Download as PDF
Top-right of the report. Useful for keeping a record before changing posts on the source platform.
5. Understanding false positives
AI content classifiers make mistakes. The most common false-positive categories you'll run into:
- Sarcasm. "Oh great, another Monday" reads as negative sentiment to a model. Context is hard.
- Reclaimed slurs and in-group language. Words used positively within a community can trip toxicity classifiers trained on broad data.
- Quotations and song lyrics. Models often miss the difference between quoting and asserting.
- News commentary. Posting about violent events isn't the same as endorsing them, but classifiers often score them similarly.
- Humor and absurdism. "I want to die, this is so funny" is not a crisis but might score that way.
Seeing these flags doesn't mean you did something wrong. The point is to know what an automated tool might surface, so you can decide whether to add context, remove the post, or leave it and have a defense ready.
6. Manage your account
From Settings:
Profile
Change your display name, organization, or password. Email changes require re-verification.
Connected accounts
Each connected platform shows the account it's linked to and when the access token expires. Long-lived Meta tokens (Facebook, Instagram) expire ~60 days after issue — we'll prompt you to reconnect when they get close.
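As a worked example of the timing, the sketch below estimates when a token issued on a given date would expire and when a reconnect prompt might appear. The 60-day lifetime is the approximate figure from this guide; the 7-day reminder window and the function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(days=60)   # approximate, per the guide
REMINDER_WINDOW = timedelta(days=7)   # assumption: prompt a week ahead


def token_expiry(issued_at: datetime) -> datetime:
    """Estimated expiry time for a long-lived token."""
    return issued_at + TOKEN_LIFETIME


def needs_reconnect_prompt(issued_at: datetime, now: datetime) -> bool:
    """True once 'now' falls inside the reminder window or past expiry."""
    return now >= token_expiry(issued_at) - REMINDER_WINDOW


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(token_expiry(issued).date())  # 2024-03-01
    print(needs_reconnect_prompt(issued, datetime(2024, 2, 23, tzinfo=timezone.utc)))  # True
```

The expiry date shown in Settings is the authoritative value; this arithmetic only approximates it.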
Subjects
You can group multiple scans under a single "subject" (useful when you scan the same handle across different platforms). Subjects can be renamed or deleted at any time.
Notifications
Toggle email alerts for completed scans, expiring connections, and account events.
7. Delete your data
You can remove your data at three different levels of granularity.
Delete a single scan
Open the scan from the dashboard → menu (three dots) → Delete. The scan, all its content items, all its analysis, and any uploaded images are removed immediately and aren't recoverable.
Disconnect a platform
Settings → click Disconnect next to the platform. We delete the encrypted token from our database and attempt a server-side revoke against the platform's permissions API. Existing scans you've already run aren't affected.
Delete your account entirely
Settings → Privacy and data → Delete account. This:
- Removes your profile, all scans, all reports, all OAuth tokens, all subjects, all notifications
- Deletes uploaded images from object storage
- Removes you from our authentication system
Account deletion is queued and completes within 30 days. You'll receive a confirmation email when it's finished.
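The three deletion levels can be summarized as a lookup of what each one removes, along with the latest date an account deletion should finish. This is a sketch built from the descriptions above; the key names and the helper are hypothetical and do not reflect RedFlagger's internal data model.

```python
from datetime import date, timedelta

# What each level removes, as described in this section (labels are assumed).
DELETION_SCOPES = {
    "scan": {"scan", "content_items", "analysis", "uploaded_images"},
    "platform": {"oauth_token"},  # existing scans are not affected
    "account": {
        "profile", "scans", "reports", "oauth_tokens",
        "subjects", "notifications", "uploaded_images", "auth_record",
    },
}

ACCOUNT_DELETION_WINDOW = timedelta(days=30)  # "within 30 days", per the guide


def account_deletion_deadline(requested_on: date) -> date:
    """Latest date by which a queued account deletion should complete."""
    return requested_on + ACCOUNT_DELETION_WINDOW


if __name__ == "__main__":
    print(account_deletion_deadline(date(2024, 1, 1)))  # 2024-01-31
    print("scans" in DELETION_SCOPES["platform"])       # False
```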
Still stuck?
The FAQ covers the most common issues: error reasons, billing, and troubleshooting.