Troubleshooting

Diagnose management, agent, listener, TLS, route, target, WAF, rate-limit, cache, and trace problems from the same operational checklist.

Use This When

Use this when a request fails, management does not load, an agent will not connect, ACME is stuck, or a traffic policy behaves unexpectedly.

Prerequisites

Start with logs

Logs show startup errors, TLS failures, agent connection problems, and target forwarding errors faster than any other check.

For Docker Compose installs:

```bash
docker compose logs -f p2pstream
```

For systemd installs:

```bash
sudo journalctl -u p2pstream -f
sudo journalctl -u p2pstream-agent -f
```
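
When the live log stream is too noisy, it can help to look only at recent lines that resemble failures. The sketch below is a minimal filter; the function name `filter_failures` and the keyword list are assumptions for illustration, not p2pstream's actual log vocabulary, so adjust them to what your logs contain.

```bash
# filter_failures: read log lines on stdin and keep only those that look like
# failures. The keyword list is an assumption; tune it to your own logs.
filter_failures() {
  grep -iE 'error|fail|tls|refused|timeout' || true
}

# Usage (not run here):
#   docker compose logs --since 15m --no-color p2pstream | filter_failures
#   sudo journalctl -u p2pstream --since "15 minutes ago" --no-pager | filter_failures
```

The `|| true` keeps the function from returning a failure status when nothing matches, which matters in scripts that run with `set -e`.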

When diagnosing public traffic, open Traffic, enable tracing, reproduce the request, and then turn tracing off.

Figure: p2pstream Traffic flow view showing a traced request through the listener, policy, route, cache, agent, upstream, and response stages. Use the Traffic flow view while reproducing a request to see which stage handled, rejected, cached, or failed the request.

Figure: p2pstream trace request details modal showing stage timing, route target, cache status, headers, and response metadata. The trace details modal is the fastest way to inspect route matching, selected target, cache outcome, agent selection, upstream timing, and response status for a single request.

Management UI Will Not Open

| Check | Fix |
| --- | --- |
| Container or service running | docker ps or systemctl status p2pstream. |
| Port published | Publish 8081:8081 or use the actual host port. |
| Scheme | Use https://host:8081 unless management TLS is explicitly off. |
| Firewall | Allow the management port from your admin network. |
| Browser UI disabled | If MANAGEMENT_UI_DISABLED=true, the browser UI intentionally returns 404; APIs and the agent Yamux tunnel remain available. |
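
A quick probe from the admin machine can narrow down which row applies. The sketch below interprets the status code that curl prints with `-w '%{http_code}'` (curl prints 000 when it received no HTTP response). The helper name `explain_status` is hypothetical, and treating any 2xx or 3xx answer as "management is up" is an assumption about how the UI responds.

```bash
# explain_status CODE: map a curl-reported HTTP status to the checks above.
explain_status() {
  case $1 in
    000)     echo "no HTTP response: check the service, published port, scheme, and firewall" ;;
    404)     echo "management answered 404: check MANAGEMENT_UI_DISABLED" ;;
    2??|3??) echo "management is answering" ;;
    *)       echo "unexpected status $1: check the management logs" ;;
  esac
}

# Usage (not run here; -k skips certificate verification for this probe only):
#   explain_status "$(curl -sk -o /dev/null -w '%{http_code}' https://host:8081/)"
```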

Browser Certificate Warning

| Cause | Fix |
| --- | --- |
| Auto-generated management TLS | Trust the generated CA or provide your own certificate. |
| Wrong hostname | Set MANAGEMENT_PUBLIC_URL and MANAGEMENT_TLS_EXTRA_HOSTS, then restart if needed. |
| Management behind another proxy | Terminate trusted TLS at that proxy or pass the correct public URL to agents. |

Cannot Log In

| Cause | Fix |
| --- | --- |
| Wrong or forgotten password | Reset it with p2pstream users reset-password USERNAME against the same database. |
| Setup window expired and no users exist | Restart the server to reopen the 5 minute setup window. |
| Reset command used the wrong database | Run it with the same CONFIG_DIR as the server or pass --database-url. |

Agent Will Not Connect

| Check | Fix |
| --- | --- |
| MANAGEMENT_URL | It must point to management, usually https://host:8081. |
| CA trust | Use MANAGEMENT_CA_FILE or MANAGEMENT_CA_PEM_BASE64 for auto TLS. |
| Token | Rotate the token and update the agent env file. |
| Agent ID | Use the generated agent-... public ID. |
| Firewall/NAT | Agent host must reach management HTTPS/TLS and /agent/tunnel. |
| Insecure URL | HTTP requires AGENT_ALLOW_INSECURE_MANAGEMENT=true, intended for development only. |

Figure: p2pstream Agents page showing connected, offline, and disabled agents with runtime and connection history. The Agents page shows whether an agent is connected, offline, disabled, recently disconnected, or missing recent connection history.

Public Listener Fails To Bind

| Cause | Fix |
| --- | --- |
| Port already used | Stop the other service or choose another listener port. |
| Missing Docker publish | Add host:container port mapping and restart the container. |
| Privileged port with non-root user | Run with enough privileges or bind a high port. |
| Bind address not present | Use an empty bind address or a real local address. |
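
The first and third causes can be checked from the host before changing any configuration. The sketch below assumes a Linux host with `ss` from iproute2; the helper names `port_listeners` and `needs_privilege` are hypothetical, and 1024 is the default Linux threshold for privileged ports, which a host may have changed.

```bash
# port_listeners PORT: read `ss -Hltn` output on stdin and print the local
# addresses already listening on PORT (column 4 is Local Address:Port).
port_listeners() {
  awk -v port="$1" '$4 ~ (":" port "$") { print $4 }'
}

# needs_privilege PORT UID: succeed when a non-root user wants a port below 1024.
needs_privilege() {
  (( $1 < 1024 && $2 != 0 ))
}

# Usage (not run here):
#   ss -Hltn | port_listeners 443
#   needs_privilege 443 "$(id -u)" && echo "binding 443 needs extra privileges"
```

Add `-p` to the `ss` call (as root) to see which process owns the socket.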

HTTPS Serves Fallback/Self-Signed Certificate

| Cause | Fix |
| --- | --- |
| No matching certificate mapping | Add a mapping for the exact host or wildcard in TLS. |
| ACME certificate not ready | Check certificate status and last error. |
| Request SNI mismatch | Test with the real hostname, not the IP address. |
| Listener not restarted | Stop/start the listener or wait for automatic restart after certificate issuance. |

Figure: p2pstream TLS page showing certificate mappings, ACME state, DNS credentials, and certificate metadata. The TLS page shows whether the requested hostname has a matching certificate mapping, whether ACME is ready, and which listener owns the mapping.

ACME Fails

| Check | Fix |
| --- | --- |
| Public DNS | Run dig +short <hostname>; it must return the p2pstream server's public IP. |
| HTTP-01 | Port 80 must reach the HTTP listener. Verify with curl -I http://<hostname>/.well-known/acme-challenge/test from an external host. |
| TLS-ALPN-01 | Port 443 must reach the HTTPS listener. |
| DNS-01 | Cloudflare zone ID and API token must be valid and enabled. |
| Wildcard | Use DNS-01; HTTP-01 and TLS-ALPN-01 do not support wildcard issuance. |
| CA | Test with staging before production. |

Route Does Not Match

| Check | Fix |
| --- | --- |
| Listener | Route must belong to the listener receiving the request. |
| Host pattern | Use exact host or *.example.com. |
| Path prefix | Prefix must start with /. |
| Priority | Lower numbers win. Put specific routes first. |
| Default route | If no explicit route matches, the listener default route handles the request. |
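
To reason about why a request landed on an unexpected route, it can help to model the rules above on paper. The sketch below is a rough model for a single listener, not p2pstream's implementation: the helper names, the route list format, plain string-prefix path matching, and the choice that `*.example.com` does not match the bare `example.com` are all assumptions.

```bash
# host_matches PATTERN HOST: exact match, or "*.example.com" matching any
# subdomain (assumed not to match the bare domain).
host_matches() {
  local pattern=$1 host=$2
  case $pattern in
    '*.'*) [[ $host == *".${pattern#\*.}" ]] ;;
    *)     [[ $host == "$pattern" ]] ;;
  esac
}

# pick_route HOST PATH: read "priority host_pattern path_prefix name" lines on
# stdin and print the matching route with the lowest priority number, or
# "default" when no explicit route matches.
pick_route() {
  local host=$1 path=$2 prio pattern prefix name
  local best_prio='' best_name=default
  while read -r prio pattern prefix name; do
    host_matches "$pattern" "$host" || continue
    [[ $path == "$prefix"* ]] || continue
    if [[ -z $best_prio || $prio -lt $best_prio ]]; then
      best_prio=$prio best_name=$name
    fi
  done
  printf '%s\n' "$best_name"
}

# Hypothetical routes for one listener.
routes=$'10 app.example.com /api api\n20 app.example.com / web\n30 *.example.com / wildcard'
pick_route app.example.com /api/users <<<"$routes"   # prints "api"
pick_route other.example.com / <<<"$routes"          # prints "wildcard"
pick_route example.org / <<<"$routes"                # prints "default"
```

If the broad `/` route had priority 5 instead of 20, it would win over `/api`, which is why specific routes need the lower numbers.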

Target Returns Bad Gateway

| Cause | Fix |
| --- | --- |
| Direct target origin unreachable | Run curl -I http://<origin-host>:<port> from inside the p2pstream container: docker compose exec p2pstream curl -I http://app:8080. |
| Agent target origin unreachable | SSH to a label-matched agent host and run curl -I http://<origin-host>:<port> to confirm the agent can reach the service from its network. |
| Agent offline | Reconnect or enable a label-matched agent. |
| Origin TLS error | Fix the origin certificate; use tls_skip_verify only as a temporary workaround for internal self-signed certs. |
| Wrong target origin | Include scheme and host, for example http://app:8080. |
| Passive health cooldown | If health checks are enabled, recent connect or timeout failures can temporarily remove the target or selected target-agent path from routing. |

Client cancellations reported as context canceled do not create passive health cooldowns. Real upstream timeouts, agent disconnects, and transport failures can still create cooldowns when target health checks are enabled.

When health checks are disabled, transient upstream failures fail only the current request and should not cause no_route_target_available.

Target Returns Gateway Timeout

| Cause | Fix |
| --- | --- |
| Origin is slow to send response headers | Increase the target response-header timeout. The default is 60000 ms. |
| Agent target waits on a private app | SSH to a label-matched agent host and test curl -I http://<origin> with a long timeout (--max-time 65) to confirm the service responds. Raise the target timeout if it does. |
| Health check timeout confusion | Health-check timeout is separate from the response-header timeout and does not affect request serving. |
| Old agent binary | Upgrade agents and servers together; old WebSocket agents are incompatible with the Yamux tunnel transport. |

The target response-header timeout limits only the wait for first upstream headers. It does not cap the duration of streaming a response after headers are received.
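
Because only the wait for the first headers is limited, the useful number to measure at the origin is time to first byte rather than total transfer time. The sketch below uses curl's `%{time_starttransfer}` write-out variable as an approximation of that wait; the helper name `ttfb_exceeds` and the origin URL are hypothetical.

```bash
# ttfb_exceeds SECONDS LIMIT_MS: succeed when a time-to-first-byte value in
# seconds (as printed by curl) is above a limit given in milliseconds.
ttfb_exceeds() {
  awk -v s="$1" -v limit_ms="$2" 'BEGIN { exit !(s * 1000 > limit_ms) }'
}

# Usage (not run here), from the p2pstream container or a label-matched agent host:
#   ttfb="$(curl -s -o /dev/null --max-time 65 -w '%{time_starttransfer}' http://app:8080/)"
#   ttfb_exceeds "$ttfb" 60000 && echo "origin is slower than the default header timeout"
```

A large gap between `%{time_total}` and `%{time_starttransfer}` points at a slow body stream, which this timeout does not cap.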

Agent Tunnel Disconnects

| Cause | Fix |
| --- | --- |
| Management reverse proxy blocks upgrades | Ensure the proxy allows HTTP/1.1 upgrade streaming for p2pstream-yamux on /agent/tunnel. |
| Idle upgraded-connection timeout is too low | Set the management proxy idle timeout high enough for long-lived Yamux tunnel sessions. |
| Keepalive failures | Check network reachability between the agent host and management URL; tunnel failures disconnect the agent so it can reconnect cleanly. |

Agent Uptime Looks Wrong

| Cause | Fix |
| --- | --- |
| Retention window changed | Uptime percentages use retained management connection history, not all-time history. Check the dashboard retention window. |
| Agent record is new | The observation window starts at the later of retention start or agent creation time. New agents do not include time before they existed. |
| Server restarted after an unclean exit | Startup closes stale open connection rows and marks affected agents disconnected at that startup time. This prevents old sessions from looking active forever. |
| Agent is offline | The Agents page shows current offline duration from the last recorded disconnect time. |
| Missing historical rows | Uptime is based on local management connections data. Deleted or expired rows cannot be reconstructed from agent self-reporting. |
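
A small worked example shows why a new agent's percentage can differ from what a naive calculation suggests. The sketch below is an illustration only: the helper names are hypothetical, times are Unix seconds, and computing uptime as connected time divided by the observation window is an assumption about how the percentage is derived.

```bash
# window_start RETENTION_START AGENT_CREATED: the later of the two timestamps.
window_start() {
  if (( $1 > $2 )); then echo "$1"; else echo "$2"; fi
}

# uptime_percent CONNECTED_SECONDS WINDOW_START NOW: share of the window spent connected.
uptime_percent() {
  awk -v c="$1" -v s="$2" -v n="$3" \
    'BEGIN { printf "%.1f\n", (n > s) ? 100 * c / (n - s) : 0 }'
}

# Retention began at t=1000, the agent was created at t=4000, now is t=10000,
# and the agent has been connected for 4500 seconds in total.
start="$(window_start 1000 4000)"    # 4000: time before the agent existed is excluded
uptime_percent 4500 "$start" 10000   # prints 75.0
```

Measured from the retention start instead, the same agent would show 50.0, so a percentage that seems too high or too low is often a window question rather than a connectivity one.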

Static Asset Is Not Cached

| Cause | Fix |
| --- | --- |
| No matching cache rule | Check host, path prefix, suffix, method, route/target filters, and priority. |
| Browser sends cookies | Enable Cache requests with Cookie headers only on precise public asset rules. |
| Authorization header present | Authorization requests always bypass cache. |
| Origin sends Set-Cookie | p2pstream will not store the response. |
| Origin sends private, no-store, or no-cache | p2pstream respects the origin denial. |
| Origin sends Vary: Cookie, Vary: Authorization, or Vary: * | p2pstream will not store the response. |
| Origin sends Vary: Accept-Encoding | This is supported; it creates separate variants. |
| Status or object size not allowed | Adjust rule status codes or max object size if appropriate. |
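
The origin-side causes in this table are visible in the response headers, so they can be checked without touching any cache rule. The sketch below scans the output of `curl -sI` for the headers listed above; the helper name `cache_blockers` and the asset URL are hypothetical, and the checks are a simplified reading of the table rather than p2pstream's exact storage logic.

```bash
# cache_blockers: read HTTP response headers on stdin and print each origin
# header that, per the table above, stops the response from being stored.
cache_blockers() {
  tr -d '\r' | awk -F': *' '
    { name = tolower($1); value = tolower($0); sub(/^[^:]*: */, "", value) }
    name == "set-cookie" { print "origin sends Set-Cookie" }
    name == "cache-control" && value ~ /private|no-store|no-cache/ {
      print "Cache-Control denies storage: " value
    }
    name == "vary" && value ~ /cookie|authorization|\*/ {
      print "Vary prevents storage: " value
    }
  '
}

# Usage (not run here), against the origin directly:
#   curl -sI http://app:8080/static/app.js | cache_blockers
```

Empty output means none of these origin headers is the cause, which leaves the rule match, request cookies, an Authorization header, status code, or object size to check.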

Rate Limits Affect Every User

| Cause | Fix |
| --- | --- |
| p2pstream sees one proxy IP | Add better key parts or place p2pstream at the edge. |
| Rule too broad | Add host/path/method matchers. |
| Priority conflict | Move specific rules to lower priority numbers. |

WAF Blocks, Challenges, Or Queues Unexpectedly

| Cause | Fix |
| --- | --- |
| Rule too broad | Narrow the WAF match by host, path, method, header, cookie, or query parameter. |
| Priority conflict | Lower priority numbers win. Adjust priorities or matches. |
| Captcha provider unavailable | Confirm the provider is enabled and site key/secret key match upstream configuration. |
| Waiting room stays active | Check trigger thresholds, active request counts, server CPU, and agent CPU in the dashboard. Use 0 to disable an automatic signal. |
| All clients share one queue identity | Add key parts that identify visitors better than remote IP when behind another proxy. |
| Large form or upload must be retried | Captcha and waiting-room admission use 303 redirects and do not replay request bodies. |

Trace Stream Reconnects

| Cause | Fix |
| --- | --- |
| Management connection interrupted | Check browser network and management logs. |
| Server restarted | Reopen Traffic after restart. |
| Too much trace volume | Use Basic or Detailed level and clear old traces. |
| Auth session expired | Log in again. |

Verification

After applying a fix, rerun the exact failing request, check Overview status classes, and use Traffic tracing only long enough to confirm the request path.
