Commit 048c9ec

Author: Paul C
v22.8.1: local-AI client fixes — fast-fail connect, no stale pool, source-chain errors
KO4BSR reported a 30+s "error sending request for url" against Ollama-on-LAN with qwen2.5:3b in v22.8.0, where curl on the same host worked in 8s.

Three changes to the AI HTTP clients (AI_SIMPLE_CLIENT and AiAgent::client):

* Drop the `local_address(0.0.0.0)` binding (the inter-node IPv6 bandaid). On hosts with policy routing or multiple default routes it can pick a different egress route than the kernel's default, which is why curl worked but WolfStack didn't.
* `pool_max_idle_per_host(0)`: local AI servers run in containers that rotate; a stale pooled connection would stall the next request for the kernel SYN budget (~30s) before failing. Fresh TCP per call removes the stall.
* `connect_timeout(5s)`: without it, an unanswered SYN blocks for ~30s before reqwest gives up. Crisp 5s failure now; 120s outer timeout still covers actual inference.

Plus a diagnostic improvement: new `ai_connection_error` helper walks `reqwest::Error::source()` and appends each cause to the message. Operators now see "tcp connect error — deadline has elapsed" or "connection refused" instead of just the opaque outer wrapper.

Verified locally: real Ollama@10.0.3.1 + qwen2.5:3b responds in 1.7s; unreachable host fails in 5.4s with full source chain.
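The split between a short connect deadline and a longer overall budget, with a fresh connection on every call, can be sketched with the standard library alone. This is not the reqwest client from the commit: `fresh_connect` is a hypothetical helper, a loopback listener stands in for the AI server, and the 120s read timeout is only a rough analogue of reqwest's whole-request `timeout`.

```rust
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of WolfStack): open a new TCP connection
/// for every call, so nothing stale is ever reused, and give up after
/// `connect_timeout` instead of waiting out the kernel's SYN retries.
fn fresh_connect(addr: SocketAddr, connect_timeout: Duration) -> Result<TcpStream, String> {
    let started = Instant::now();
    let stream = TcpStream::connect_timeout(&addr, connect_timeout).map_err(|e| {
        format!("connect to {} failed after {:?}: {}", addr, started.elapsed(), e)
    })?;
    // Separate, much longer budget for the slow part (waiting on inference).
    stream
        .set_read_timeout(Some(Duration::from_secs(120)))
        .map_err(|e| format!("could not set read timeout: {}", e))?;
    Ok(stream)
}

fn main() {
    // A listener on an ephemeral loopback port stands in for the AI server.
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind loopback");
    let addr = listener.local_addr().expect("local addr");

    match fresh_connect(addr, Duration::from_secs(5)) {
        Ok(stream) => println!("connected to {:?}", stream.peer_addr()),
        Err(msg) => println!("{}", msg),
    }
}
```

Keeping the two budgets separate means an unreachable host is reported within seconds while a slow but healthy model can still take its time to answer.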
1 parent c9b23ac commit 048c9ec

2 files changed

Lines changed: 51 additions & 7 deletions

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "wolfstack"
-version = "22.8.0"
+version = "22.8.1"
 edition = "2024"
 authors = ["Wolf Software Systems Ltd"]
 description = "Server management platform for the Wolf software suite"

src/ai/mod.rs

Lines changed: 50 additions & 6 deletions
@@ -22,16 +22,55 @@ pub mod baseline;
 /// Shared HTTP client for the stateless `simple_chat` entry point
 /// (used by plugins and the wolfagents dispatcher). AiAgent owns its
 /// own `client` field, so this is only for callers who don't have an
-/// AiAgent instance handy. 120s timeout matches the inference latency
-/// budget the AI paths expect.
+/// AiAgent instance handy.
+///
+/// Three things matter here, all driven by KO4BSR's v22.8.0 report
+/// of curl-works-but-WolfStack-doesn't on `http://<lan-ip>:11434`:
+///
+/// 1. **No `local_address` binding.** The `ipv4_only_client_builder`
+///    helper binds to `0.0.0.0` to skip IPv6 candidates for inter-node
+///    polling. On hosts with policy routing or multiple default routes,
+///    binding the source address can pick a different egress route
+///    than `curl` (which leaves it to the kernel). Plain default →
+///    same routing as curl.
+/// 2. **No connection pool.** Local AI servers run in containers that
+///    rotate. A pooled connection to a stopped container means the
+///    next request stalls for the full kernel SYN budget (~30s)
+///    before reqwest gives up. `pool_max_idle_per_host(0)` forces a
+///    fresh TCP every call — harmless overhead at human-scale call
+///    rates, removes the stall.
+/// 3. **Fast `connect_timeout`.** Without it, a connect on a
+///    non-answering host blocks for the kernel SYN-retry budget
+///    (~30s) and the user sees no error until then. 5s gives a
+///    crisp failure with a useful message; the 120s outer timeout
+///    still covers actual inference.
 static AI_SIMPLE_CLIENT: std::sync::LazyLock<reqwest::Client> =
     std::sync::LazyLock::new(|| {
-        crate::api::ipv4_only_client_builder()
+        reqwest::Client::builder()
+            .pool_max_idle_per_host(0)
+            .connect_timeout(std::time::Duration::from_secs(5))
             .timeout(std::time::Duration::from_secs(120))
             .build()
             .unwrap_or_else(|_| reqwest::Client::new())
     });
 
+/// Format a reqwest send-error with its full source chain. Reqwest's
+/// outer Display message is generic ("error sending request for url
+/// ..."); the actual cause (connection refused, operation timed out,
+/// broken pipe) lives in `e.source()` and is what an operator
+/// actually needs to debug. Walk the chain so the UI shows it.
+fn ai_connection_error(url: &str, err: &reqwest::Error) -> String {
+    use std::error::Error;
+    let mut msg = format!("Local AI connection failed ({}): {}", url, err);
+    let mut current: &dyn Error = err;
+    while let Some(src) = current.source() {
+        msg.push_str(" — ");
+        msg.push_str(&src.to_string());
+        current = src;
+    }
+    msg
+}
+
 /// Outcome of a single health check. `Ok` drives the alert→OK
 /// transition (fire "cleared" notifications on private channels).
 /// `Alert` means notifications have already been sent inside
@@ -509,7 +548,12 @@ impl AiAgent {
             last_health_check: Mutex::new(None),
             alerting_hosts: Mutex::new(std::collections::HashSet::new()),
             knowledge_base,
-            client: crate::api::ipv4_only_client_builder()
+            // See AI_SIMPLE_CLIENT above for the rationale on each
+            // setting. AiAgent makes the same kinds of calls so it
+            // gets the same client recipe.
+            client: reqwest::Client::builder()
+                .pool_max_idle_per_host(0)
+                .connect_timeout(std::time::Duration::from_secs(5))
                 .timeout(std::time::Duration::from_secs(120))
                 .build()
                 .unwrap_or_else(|_| reqwest::Client::new()),
@@ -2373,7 +2417,7 @@ async fn call_local_inner(
     }
 
     let resp = req.send().await
-        .map_err(|e| format!("Local AI connection failed ({}): {}", url, e))?;
+        .map_err(|e| ai_connection_error(&url, &e))?;
 
     let status = resp.status();
     let text = resp.text().await.map_err(|e| format!("Local AI response error: {}", e))?;
@@ -2870,7 +2914,7 @@ async fn call_local_no_tools(
     }
 
     let resp = req.send().await
-        .map_err(|e| format!("Local AI connection failed ({}): {}", url, e))?;
+        .map_err(|e| ai_connection_error(url, &e))?;
     let status = resp.status();
     let text = resp.text().await.map_err(|e| format!("Local AI response error: {}", e))?;
     if !status.is_success() {
