<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://ccomkhj.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://ccomkhj.github.io/" rel="alternate" type="text/html" /><updated>2026-05-03T12:25:19+00:00</updated><id>https://ccomkhj.github.io/feed.xml</id><title type="html">pile of thoughts</title><subtitle>Huijo Kim&apos;s personal site — machine learning, computer vision, MLOps, and reflections.</subtitle><author><name>Huijo Kim</name><email>ccomkhj@gmail.com</email></author><entry><title type="html">KL Divergence, Practically — What I Got Wrong</title><link href="https://ccomkhj.github.io/KL/" rel="alternate" type="text/html" title="KL Divergence, Practically — What I Got Wrong" /><published>2026-01-02T00:00:00+00:00</published><updated>2026-01-02T00:00:00+00:00</updated><id>https://ccomkhj.github.io/KL</id><content type="html" xml:base="https://ccomkhj.github.io/KL/"><![CDATA[<p>I ran into a recurring “KL term” while reading <a href="http://proceedings.mlr.press/v37/rezende15.pdf">Rezende &amp; Mohamed (2015), Variational Inference with Normalizing Flows</a> and realized my mental model was slightly off. I used to treat “KL term” as basically “cross-entropy,” so I thought it was just another classification-like penalty. That belief is <strong>sometimes</strong> practically correct, but it is also <strong>misleading</strong> in the exact context where KL shows up most prominently: variational inference and normalizing flows. This note is my attempt to resolve that confusion.</p>

<p>(Primary motivation: Rezende &amp; Mohamed, 2015.)</p>

<hr />

<h2 id="1-what-kl-divergence-actually-is">1. What KL Divergence Actually Is</h2>

<p>KL divergence measures how far one probability distribution is from another:</p>

\[D_{KL}(P\|Q) = \mathbb{E}_{x\sim P}\left[\log \frac{P(x)}{Q(x)}\right].\]

<p>A few properties I need to keep front-of-mind:</p>

<ul>
  <li><strong>Not symmetric</strong>: $D_{KL}(P\|Q) \neq D_{KL}(Q\|P)$.</li>
  <li><strong>Not a “distance” metric</strong> (no triangle inequality).</li>
  <li>It is an <strong>expectation under $P$</strong>, meaning the direction strongly affects behavior.</li>
</ul>

<p>The direction is not cosmetic: it changes optimization behavior (mode-covering vs mode-seeking).</p>
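<p>To make the asymmetry concrete, here is a minimal sketch for discrete distributions. The two distributions <code>p</code> and <code>q</code> are made-up numbers chosen only for illustration; the function follows the definition above directly.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math


def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions given as lists of probabilities."""
    # Terms with p_i == 0 contribute nothing (0 * log 0 is taken as 0).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi != 0.0)


# Two hypothetical distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.4, 0.4, 0.2]

forward = kl_divergence(p, q)  # expectation taken under p
reverse = kl_divergence(q, p)  # expectation taken under q

print(f"D_KL(P||Q) = {forward:.4f}")
print(f"D_KL(Q||P) = {reverse:.4f}")
</code></pre></div></div>

<p>The two printed values differ, which is all “not symmetric” means in practice: swapping the arguments changes which distribution weights the log-ratio.</p>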

<hr />

<h2 id="2-where-my-confusion-came-from-kl-vs-cross-entropy">2. Where My Confusion Came From: KL vs Cross-Entropy</h2>

<p>I previously thought:</p>

<blockquote>
  <p>“KL term is basically cross-entropy.”</p>
</blockquote>

<p>This is <strong>only conditionally true</strong>.</p>

<p>The key identity is:</p>

\[D_{KL}(P\|Q) = H(P,Q) - H(P),\]

<p>where</p>

<ul>
  <li>cross-entropy: $H(P,Q) = -\mathbb{E}_{x\sim P}[\log Q(x)]$</li>
  <li>entropy: $H(P) = -\mathbb{E}_{x\sim P}[\log P(x)]$</li>
</ul>

<p>So if <strong>$P$ is fixed</strong>, then $H(P)$ is constant, and minimizing $D_{KL}(P\|Q)$ is equivalent to minimizing cross-entropy $H(P,Q)$.</p>

<p>That is why in standard supervised classification with fixed labels, it <em>feels</em> like “cross-entropy = KL”.</p>
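<p>A small numeric check of the identity, assuming discrete distributions and made-up values for the label and the prediction. With a one-hot label, $H(P)=0$, so cross-entropy and KL coincide exactly; with a soft (but still fixed) label they differ by the constant $H(P)$.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi != 0.0)


def entropy(p):
    """H(p) = -sum_i p_i * log(p_i)."""
    return -sum(pi * math.log(pi) for pi in p if pi != 0.0)


def kl_divergence(p, q):
    """D_KL(p || q), computed directly from its definition."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi != 0.0)


one_hot = [0.0, 1.0, 0.0]     # fixed "true label" distribution P
soft = [0.1, 0.6, 0.3]        # a fixed but non-degenerate P
prediction = [0.2, 0.7, 0.1]  # hypothetical model output Q

for name, p in [("one-hot", one_hot), ("soft", soft)]:
    ce, h, kl = cross_entropy(p, prediction), entropy(p), kl_divergence(p, prediction)
    print(f"{name}: H(P,Q)={ce:.4f}  H(P)={h:.4f}  KL={kl:.4f}  H(P,Q)-H(P)={ce - h:.4f}")
</code></pre></div></div>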

<h3 id="the-correction-to-my-belief">The correction to my belief</h3>
<p>The “KL term” is not <em>intrinsically</em> cross-entropy. Cross-entropy is what KL reduces to <strong>when the target distribution $P$ is fixed and we only optimize $Q$</strong>.</p>

<p>That “fixed target” assumption is exactly what breaks in many important ML objectives.</p>

<hr />

<h2 id="3-the-vae--variational-inference-setting-kl-is-not-a-label-loss">3. The VAE / Variational Inference Setting: KL is Not a “Label Loss”</h2>

<p>Rezende &amp; Mohamed frame variational inference as maximizing a lower bound on $\log p(x)$ (the evidence), because the true marginal likelihood is typically intractable. They write the ELBO as:</p>

\[\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}[\log p_\theta(x \mid z)] - D_{KL}(q_\phi(z \mid x)\|p(z)).\]

<p>This is the core equation in the paper.</p>

<p>Here the KL is:</p>

\[D_{KL}(q_\phi(z \mid x)\|p(z)).\]

<p>This is the conceptual point I had wrong: <strong>this KL is not comparing “predictions vs labels.”</strong><br />
It is comparing:</p>

<ul>
  <li>$q_\phi(z \mid x)$: an <em>approximate posterior</em> produced by the inference network (encoder)</li>
  <li>$p(z)$: the prior</li>
</ul>

<p>So the KL term is a <strong>regularizer / information constraint</strong>: it prevents the inference network from encoding arbitrary information in $z$ and pushes the posterior toward the prior.</p>
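<p>As a sketch of what this term looks like in code: under the common (but not universal) assumption that the encoder outputs a diagonal Gaussian and the prior is a standard normal, the KL has the closed form $\tfrac{1}{2}\sum_j(\mu_j^2 + \sigma_j^2 - 1 - \log\sigma_j^2)$. The function name and the example encoder outputs below are hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math


def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv
        for m, lv in zip(mu, log_var)
    )


# Hypothetical encoder outputs for one input x (2-dimensional latent).
print(kl_diag_gaussian_to_standard_normal([0.0, 0.0], [0.0, 0.0]))    # posterior equals prior
print(kl_diag_gaussian_to_standard_normal([1.5, -0.5], [-1.0, 0.3]))  # posterior moved away
</code></pre></div></div>

<p>Note that no labels appear anywhere: the penalty depends only on how far the encoder's output distribution sits from the prior.</p>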

<h3 id="why-my-cross-entropy-intuition-fails-here">Why my cross-entropy intuition fails here</h3>
<p>In classification, $P$ is typically fixed (true labels).<br />
In the ELBO, <strong>$q_\phi(z \mid x)$ is learned</strong> and changes during training.</p>

<p>So even though the identity $D_{KL} = H(P,Q) - H(P)$ always holds mathematically, the “$H(P)$ is constant” trick is not something I can rely on in intuition. The moving part $q_\phi$ is exactly what I’m optimizing.</p>

<hr />

<h2 id="4-another-important-subtlety-kl-direction-matters">4. Another Important Subtlety: KL Direction Matters</h2>

<p>The ELBO uses:</p>

\[D_{KL}(q_\phi(z \mid x)\|p(z)).\]

<p>This is the “reverse” direction relative to what I mentally associate with supervised learning (which usually resembles $D_{KL}(P\|Q)$ with fixed $P$).</p>

<p>This direction has practical consequences:</p>

<ul>
  <li>$D_{KL}(P\|Q)$ (“forward KL”) strongly penalizes $Q$ for putting low probability mass where $P$ has mass → tends to be <strong>mode-covering</strong>.</li>
  <li>$D_{KL}(Q\|P)$ (“reverse KL”) strongly penalizes $Q$ for putting mass where $P$ has low mass → tends to be <strong>mode-seeking</strong>.</li>
</ul>
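<p>The two behaviors can be seen in a toy experiment. The sketch below is my own illustration, not something from the paper: it fits a single Gaussian $Q$ to a bimodal target $P$ by brute-force search over a small grid of candidate means and standard deviations, once under each direction. The target, the grid, and the candidate list are all assumptions chosen to keep the example short.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math

# Grid over x; DX turns densities into approximate probabilities.
DX = 0.05
XS = [-10.0 + i * DX for i in range(401)]


def log_normal(x, mean, std):
    """Log-density of N(mean, std^2) at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def log_target(x):
    """Log-density of a bimodal target P: equal mixture of N(-3, 0.5^2) and N(3, 0.5^2)."""
    a = math.log(0.5) + log_normal(x, -3.0, 0.5)
    b = math.log(0.5) + log_normal(x, 3.0, 0.5)
    top = max(a, b)
    return top + math.log(math.exp(a - top) + math.exp(b - top))


def forward_kl(mean, std):
    """D_KL(P || Q): the expectation is taken under the target P."""
    return sum(
        math.exp(log_target(x)) * (log_target(x) - log_normal(x, mean, std)) * DX
        for x in XS
    )


def reverse_kl(mean, std):
    """D_KL(Q || P): the expectation is taken under the approximation Q."""
    return sum(
        math.exp(log_normal(x, mean, std)) * (log_normal(x, mean, std) - log_target(x)) * DX
        for x in XS
    )


# Candidate single-Gaussian approximations Q = N(mean, std^2).
candidates = [(i * 0.5, s) for i in range(-8, 9) for s in (0.5, 1.0, 2.0, 3.0)]

best_forward = min(candidates, key=lambda c: forward_kl(*c))
best_reverse = min(candidates, key=lambda c: reverse_kl(*c))

print("forward KL picks (mean, std):", best_forward)
print("reverse KL picks (mean, std):", best_reverse)
</code></pre></div></div>

<p>Forward KL selects a wide Gaussian centered between the two modes (covering both), while reverse KL selects a narrow Gaussian sitting on one of the modes and ignores the other.</p>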

<p>In variational inference, this asymmetry is part of why approximate posteriors can miss modes (the “mode-seeking” behavior is a known limitation). Normalizing flows in Rezende &amp; Mohamed are partly motivated by making $q_\phi(z \mid x)$ flexible enough to reduce that approximation gap.</p>

<hr />

<h2 id="5-why-normalizing-flows-make-the-kl-term-even-more-central">5. Why Normalizing Flows Make the KL Term Even More Central</h2>

<p>Rezende &amp; Mohamed propose constructing the approximate posterior by transforming a simple base distribution through a sequence of invertible mappings (“normalizing flow”):</p>

<ul>
  <li>start: $z_0 \sim q_0(z_0 \mid x)$ (often Gaussian)</li>
  <li>transform: $z_k = f_k(z_{k-1})$ for invertible $f_k$</li>
  <li>end: $z_K$ has a more complex distribution $q_K(z_K \mid x)$</li>
</ul>

<p>Because the transform is invertible, the density changes via change-of-variables:</p>

\[\log q_K(z_K \mid x) = \log q_0(z_0 \mid x) - \sum_{k=1}^{K}\log\left|\det \frac{\partial f_k}{\partial z_{k-1}}\right|.\]

<p>This flexibility makes the approximate posterior richer, which tightens the variational bound. This is one of the main contributions of the paper.</p>
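<p>The change-of-variables bookkeeping can be sketched in one dimension. To keep it checkable by hand, this example uses simple affine maps $f(z) = az + b$ rather than the planar or radial flows from the paper; that substitution, the function names, and the layer values are my assumptions. Because a chain of affine maps applied to a Gaussian is still Gaussian, the tracked log-density can be compared against the analytic answer.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math


def log_standard_normal(z):
    """Log-density of the base distribution q_0 = N(0, 1)."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def affine_flow(z, scale, shift):
    """One invertible map f(z) = scale * z + shift and its log |det Jacobian|."""
    return scale * z + shift, math.log(abs(scale))


def flow_log_density(z0, layers):
    """Push z0 through the layers, tracking log q_K(z_K) via change of variables."""
    log_q = log_standard_normal(z0)
    z = z0
    for scale, shift in layers:
        z, log_det = affine_flow(z, scale, shift)
        log_q -= log_det  # subtract log |det df_k/dz_{k-1}| at each step
    return z, log_q


# Two hypothetical layers; composed, z_K = 1.5 * (2.0 * z_0 + 1.0) - 3.0 = 3.0 * z_0 - 1.5.
layers = [(2.0, 1.0), (1.5, -3.0)]
z_k, log_q_k = flow_log_density(0.3, layers)

# The pushed-forward distribution is N(-1.5, 3^2), so we can cross-check analytically.
analytic = log_standard_normal((z_k + 1.5) / 3.0) - math.log(3.0)
print(z_k, log_q_k, analytic)
</code></pre></div></div>

<p>The last two printed numbers agree up to floating-point error. With non-affine layers there is no analytic shortcut, but the accumulation loop stays the same.</p>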

<h3 id="my-updated-intuition">My updated intuition</h3>
<p>The KL term is not an “annoying extra penalty.” It is the <strong>explicit mismatch measure</strong> between what my inference model can represent ($q_\phi$) and what the generative model assumes ($p$).<br />
Normalizing flows are an upgrade to $q_\phi$ so that this mismatch can be reduced without sacrificing scalability.</p>

<hr />

<h2 id="6-when-the-target-distribution-is-not-fixed-happens-in-practice">6. When the “Target Distribution is Not Fixed” Happens in Practice</h2>

<p>This directly answers my earlier confusion: <em>when is $P$ not fixed?</em></p>

<p>In variational inference and flows, the “target” distribution in a KL can involve learnable distributions such as \(q_\phi(z \mid x)\) or teacher/student distributions that change over time. Concrete examples:</p>

<ul>
  <li><strong>Variational inference / VAE</strong>: \(q_\phi(z \mid x)\) is learned.</li>
  <li><strong>Normalizing flows in VI</strong>: the whole posterior family is learned through transformations.</li>
  <li><strong>Online distillation / EMA teachers</strong>: teacher distributions evolve during training.</li>
  <li><strong>RL</strong>: policies and visitation distributions shift.</li>
</ul>

<p>This is the world where “KL term = cross-entropy” stops being a reliable mental shortcut.</p>

<hr />

<h2 id="7-the-corrected-takeaway-i-want-to-keep">7. The Corrected Takeaway I Want to Keep</h2>

<p>My previous belief:</p>
<ul>
  <li>“KL is basically cross-entropy.”</li>
</ul>

<p>What I now believe:</p>
<ul>
  <li>KL is a <strong>distribution mismatch measure</strong>.</li>
  <li>Cross-entropy is one <strong>special case view</strong> of KL when the target distribution is fixed.</li>
  <li>In variational inference (including normalizing flows), the KL term is the <em>core regularizer</em> that shapes the posterior approximation, and its direction matters.</li>
</ul>

<p>If I keep that in mind, the ELBO decomposition in Rezende &amp; Mohamed stops looking like “reconstruction loss + random KL penalty” and starts looking like what it really is:</p>

<blockquote>
  <p>a likelihood-fitting term plus an explicit constraint on how much the posterior is allowed to deviate from the prior, with flows making that posterior expressive enough to be useful.</p>
</blockquote>]]></content><author><name>Huijo</name></author><category term="Machine Learning" /><summary type="html"><![CDATA[Why treating KL as 'just cross-entropy' breaks down inside variational inference, and what its asymmetry actually does to optimization (mode-covering vs mode-seeking).]]></summary></entry><entry><title type="html">VAE, Practically — What I Got Wrong</title><link href="https://ccomkhj.github.io/VAE/" rel="alternate" type="text/html" title="VAE, Practically — What I Got Wrong" /><published>2026-01-02T00:00:00+00:00</published><updated>2026-01-02T00:00:00+00:00</updated><id>https://ccomkhj.github.io/VAE</id><content type="html" xml:base="https://ccomkhj.github.io/VAE/"><![CDATA[<p>While reading <a href="http://proceedings.mlr.press/v37/rezende15.pdf">Variational Inference with Normalizing Flows, Rezende &amp; Mohamed (2015)</a>, I noticed my own mental model of VAEs was slightly off. I was carrying an “engineering” intuition that worked for autoencoders, but it created confusion the moment I tried to interpret VAEs as <strong>variational inference</strong> and relate them to forecasting problems like strawberry yield prediction.</p>

<p>This post is my corrected summary, written as a set of “mis-beliefs → corrections,” grounded in the variational inference framing emphasized in the paper.</p>

<hr />

<h2 id="1-my-first-mis-belief-decoder-is-the-posterior">1) My first mis-belief: “Decoder is the posterior”</h2>

<h3 id="what-i-believed">What I believed</h3>
<blockquote>
  <p>The decoder is the posterior.</p>
</blockquote>

<h3 id="whats-actually-true">What’s actually true</h3>
<p>In VAE terminology:</p>

<ul>
  <li>The <strong>posterior</strong> is $p_\theta(z \mid x)$.</li>
  <li>But it is usually intractable in deep generative models.</li>
  <li>So VAEs introduce a tractable approximation $q_\phi(z \mid x)$.</li>
</ul>

<p>That means:</p>

<ul>
  <li>
    <p><strong>Encoder</strong> $\approx$ approximate posterior (inference network):<br />
$q_\phi(z \mid x) \approx p_\theta(z \mid x)$</p>
  </li>
  <li>
    <p><strong>Decoder</strong> $\approx$ likelihood / generative model:<br />
$p_\theta(x \mid z)$</p>
  </li>
</ul>

<p>So the decoder is not “the posterior.” The decoder is <em>one of the ingredients</em> used to define the posterior (via Bayes’ rule), but the direction is the opposite:</p>

<ul>
  <li>Decoder: $z \rightarrow x$ (generate)</li>
  <li>Encoder: $x \rightarrow z$ (infer)</li>
</ul>

<p>This mapping is consistent with the paper’s emphasis: <strong>variational inference replaces posterior inference with optimization over a variational family</strong>, where $q_\phi(z\mid x)$ is the variational distribution and is learned to match $p_\theta(z\mid x)$.</p>

<hr />

<h2 id="2-my-second-mis-belief-encoder-compresses-decoder-expands">2) My second mis-belief: “Encoder compresses, decoder expands”</h2>

<h3 id="what-i-believed-1">What I believed</h3>
<blockquote>
  <p>Encoder compresses a pipeline; decoder expands a pipeline. Therefore, VAE is a fancy compress–decompress trick.</p>
</blockquote>

<h3 id="whats-actually-true-and-why-this-confusion-happens">What’s actually true (and why this confusion happens)</h3>
<p>This “compress / expand” idea comes from image VAEs, where:</p>

<ul>
  <li>$z$ is low-dimensional,</li>
  <li>$x$ is a high-dimensional image,</li>
  <li>and the decoder visually looks like an upsampling network.</li>
</ul>

<p>But for forecasting (e.g., predicting yield as a scalar), “expand” is not a meaningful concept:</p>

<ul>
  <li>The target $y$ is often 1-dimensional.</li>
  <li>The decoder does not necessarily “expand”; it often maps $(x, z)$ to a scalar distribution.</li>
</ul>

<p>So the right mental model is:</p>

<blockquote>
  <p>The encoder and decoder are not defined by dimensionality changes.<br />
They are defined by <em>probabilistic roles</em> in variational inference.</p>
</blockquote>

<ul>
  <li>Encoder: a <strong>recognition model</strong> for approximate posterior inference.</li>
  <li>Decoder: a <strong>generative model</strong> defining the likelihood.</li>
</ul>

<hr />

<h2 id="3-what-vae-inference-really-means-and-why-generative-inference-confused-me">3) What “VAE inference” really means (and why “generative inference” confused me)</h2>

<h3 id="my-confusion">My confusion</h3>
<p>The phrase “generative inference” didn’t make sense to me, because inference should mean “estimate hidden state,” not “generate outputs.”</p>

<h3 id="the-correction">The correction</h3>
<p>In probabilistic modeling, “inference” usually means:<br />
<strong>compute or approximate the posterior over latent variables</strong>.</p>

<p>In a VAE, the true posterior is:</p>

\[p_\theta(z \mid x) = \frac{p_\theta(x \mid z)\,p(z)}{p_\theta(x)}.\]

<p>But VAEs <em>do not compute this directly</em>. Instead, they learn:</p>

\[q_\phi(z \mid x) \approx p_\theta(z \mid x).\]

<p>So:</p>

<ul>
  <li><strong>VAE inference</strong> = run the encoder to obtain $q_\phi(z \mid x)$ (or its parameters).</li>
  <li><strong>Generation / sampling</strong> = sample $z \sim p(z)$ (or a conditional prior) and decode $x \sim p_\theta(x\mid z)$.</li>
</ul>

<p>These are different operations. Inference is “backward reasoning.” Generation is “forward simulation.”</p>

<p>The paper’s message is basically:</p>
<blockquote>
  <p>we make inference scalable by turning it into an optimization problem, and then amortizing it with a neural network.</p>
</blockquote>

<hr />

<h2 id="4-strawberry-yield-forecasting-how-i-map-state-vs-observation-properly">4) Strawberry yield forecasting: how I map “state vs observation” properly</h2>

<p>Now I apply this to a practical problem:</p>

<ul>
  <li>I observe temperature and fruit count.</li>
  <li>I want to forecast strawberry yield.</li>
</ul>

<h3 id="observed-variables-measurements">Observed variables (measurements)</h3>
<p>Let:</p>

<ul>
  <li>$T_{1:t}$ = temperature history up to time $t$</li>
  <li>$C_{1:t}$ = fruit_count history up to time $t$</li>
  <li>$Y_t$ = yield (today, or at harvest)</li>
</ul>

<p>I’ll group the observed covariates as $x$:</p>

\[x = (T_{1:t}, C_{1:t}, \text{engineered features}).\]

<h3 id="latent-state-unobserved-but-important">Latent state (unobserved but important)</h3>
<p>A forecasting model often benefits from a hidden state representing things I <em>don’t measure well</em>:</p>

<ul>
  <li>plant vigor</li>
  <li>stress (heat / water / disease)</li>
  <li>phenological stage</li>
  <li>cultivar or management effects</li>
  <li>microclimate differences</li>
</ul>

<p>Call that latent crop condition $Z_t$.</p>

<p>So, in Bayesian terms, what I actually want is:</p>

\[p_\theta(Z_t \mid x),\]

<p>i.e., “given observed data, what crop states are plausible?”</p>

<p>That is exactly a posterior.</p>

<hr />

<h2 id="5-what-the-decoder-becomes-in-this-forecasting-case">5) What the decoder becomes in this forecasting case</h2>

<p>In a forecasting-oriented VAE (more precisely, a conditional VAE), the decoder is a probabilistic forecast model:</p>

\[p_\theta(Y \mid x, z).\]

<p>This is the “forward” story:</p>

<ul>
  <li>if the crop state is $z$</li>
  <li>and covariates are $x$</li>
  <li>then yield $Y$ follows some distribution.</li>
</ul>

<p>For example, the decoder might output $(\mu_\theta(x,z), \sigma_\theta(x,z))$ for a Gaussian yield distribution.</p>
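<p>Under that Gaussian assumption, the decoder's training signal is the negative log-likelihood of the observed yield. A minimal sketch, with made-up decoder outputs and an arbitrary yield unit:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math


def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of an observed yield y under N(mu, sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + 0.5 * ((y - mu) / sigma) ** 2


# Hypothetical decoder outputs for one (x, z) pair, and the yield actually observed.
mu, sigma = 4.2, 0.5
observed_yield = 4.6

print(gaussian_nll(observed_yield, mu, sigma))  # reasonable uncertainty
print(gaussian_nll(observed_yield, mu, 0.05))   # same mean, overconfident sigma
</code></pre></div></div>

<p>The second call is penalized far more heavily even though the predicted mean is identical, which is why a likelihood-based decoder has to learn its uncertainty and not only its point forecast.</p>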

<p>So yes: the decoder is the forecasting component.<br />
But it is not “expanding” by default—it is <strong>defining a likelihood</strong>.</p>

<hr />

<h2 id="6-then-what-is-the-encoder-in-this-forecasting-case">6) Then what is the encoder in this forecasting case?</h2>

<p>During training, I have both covariates and yield, so I can infer what latent crop state best explains the outcome:</p>

\[q_\phi(z \mid x, y).\]

<p>Interpretation:</p>
<blockquote>
  <p>“Given sensors and realized yield, what hidden crop condition must have been present?”</p>
</blockquote>

<p>That’s posterior inference (approximate), and it is the core meaning of “VAE inference.”</p>

<p>This is aligned with Rezende &amp; Mohamed’s viewpoint:<br />
the inference model $q_\phi$ is trained to approximate the true posterior while keeping optimization tractable.</p>

<hr />

<h2 id="7-the-forecasting-time-detail-i-initially-missed-i-need-a-conditional-prior">7) The forecasting-time detail I initially missed: I need a conditional prior</h2>

<p>A key practical point:</p>

<p>At prediction time, I do not know $y$, so I cannot directly use $q_\phi(z \mid x, y)$.</p>

<p>To forecast, I need a distribution over latent states given only covariates:</p>

\[p_\psi(z \mid x),\]

<p>sometimes called a <strong>conditional prior</strong>.</p>

<p>Then forecasting is:</p>

<p>1) Sample latent crop states: $z^{(k)} \sim p_\psi(z \mid x)$<br />
2) Decode yields: $y^{(k)} \sim p_\theta(y \mid x, z^{(k)})$<br />
3) Aggregate samples to get a predictive distribution.</p>
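<p>The three steps above can be sketched as a Monte Carlo loop. Everything concrete here is a toy stand-in: <code>conditional_prior</code> and <code>decoder</code> are hand-written functions with invented coefficients, where a real model would use trained networks for $p_\psi(z \mid x)$ and $p_\theta(y \mid x, z)$, and both distributions are assumed Gaussian.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random
import statistics


def conditional_prior(x):
    """Toy stand-in for p_psi(z | x): returns (mean, std) of the latent crop state."""
    return 0.02 * x["mean_temperature"], 0.2


def decoder(x, z):
    """Toy stand-in for p_theta(y | x, z): returns (mean, std) of the yield."""
    return 0.01 * x["fruit_count"] + 3.0 * z, 0.3


def forecast(x, num_samples=5000, seed=0):
    """Monte Carlo predictive distribution: sample z, then sample y given z."""
    rng = random.Random(seed)
    prior_mean, prior_std = conditional_prior(x)
    samples = []
    for _ in range(num_samples):
        z = rng.gauss(prior_mean, prior_std)      # step 1: latent crop state
        y_mean, y_std = decoder(x, z)
        samples.append(rng.gauss(y_mean, y_std))  # step 2: yield given that state
    # step 3: aggregate into a point forecast and a 90% interval
    cuts = statistics.quantiles(samples, n=20)
    return statistics.fmean(samples), cuts[0], cuts[-1]


mean, low, high = forecast({"mean_temperature": 25.0, "fruit_count": 200.0})
print(f"mean={mean:.2f}, 90% interval=({low:.2f}, {high:.2f})")
</code></pre></div></div>

<p>The interval width reflects both sources of spread: uncertainty about the latent state and the decoder's own noise.</p>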

<p>This is how the model produces:</p>

<ul>
  <li>a mean forecast</li>
  <li>prediction intervals</li>
  <li>multi-modal outcomes (if relevant)</li>
</ul>

<hr />

<h2 id="8-comparing-this-to-lightgbm-the-baseline-that-keeps-me-honest">8) Comparing this to LightGBM (the baseline that keeps me honest)</h2>

<p>LightGBM is typically:</p>

\[\hat{y} = f_{\text{LGBM}}(x),\]

<p>a direct mapping from engineered covariates to yield.</p>

<p>A VAE-style forecast is instead:</p>

\[z \sim p_\psi(z \mid x), \quad y \sim p_\theta(y \mid x, z).\]

<p>This difference matters when:</p>

<ul>
  <li>there are hidden factors not captured by $x$,</li>
  <li>the same $x$ maps to multiple plausible yields,</li>
  <li>or uncertainty calibration is important (operations planning, labor scheduling, contracts).</li>
</ul>

<p>If none of those are true, LGBM is usually the better engineering choice.</p>

<hr />

<h2 id="9-my-corrected-takeaways-the-short-list">9) My corrected takeaways (the short list)</h2>

<h3 id="what-i-used-to-think">What I used to think</h3>
<ul>
  <li>Encoder compresses</li>
  <li>Decoder expands</li>
  <li>Decoder is the posterior</li>
  <li>“Generative inference” means predicting</li>
</ul>

<h3 id="what-i-now-think">What I now think</h3>
<ul>
  <li>Encoder is the <strong>approximate posterior</strong>: $q_\phi(z \mid \cdot)$</li>
  <li>Decoder is the <strong>likelihood / generative model</strong>: $p_\theta(\cdot \mid z)$</li>
  <li>In forecasting, decoder is best seen as a <strong>probabilistic forecaster</strong>, not an “expander”</li>
  <li>“VAE inference” means <strong>latent state inference</strong> via the encoder</li>
  <li>For forecasting, I often need a <strong>conditional prior</strong> $p_\psi(z \mid x)$ to sample latent states when $y$ is unknown</li>
</ul>

<hr />

<h2 id="10-one-sentence-that-finally-fixed-my-mental-model">10) One sentence that finally fixed my mental model</h2>

<p>A VAE-style forecaster for strawberry yield is:</p>

<blockquote>
  <p>a model that learns a latent crop-condition variable $z$ and uses variational inference (via an encoder) to approximate the posterior over $z$, while a decoder defines a probabilistic forecast $p_\theta(y \mid x, z)$ that can be sampled for uncertainty-aware predictions.</p>
</blockquote>

<p>This framing made the Rezende &amp; Mohamed (2015) motivation click:<br />
<strong>variational inference is the mathematical reason VAEs exist, and “encoder/decoder” are just neural parameterizations of the variational posterior and the likelihood.</strong></p>]]></content><author><name>Huijo</name></author><category term="Machine Learning" /><summary type="html"><![CDATA[A corrected mental model for VAEs, written as 'mis-belief → correction' notes after re-reading Rezende & Mohamed (2015) and reframing the encoder/decoder as variational inference rather than autoencoding.]]></summary></entry><entry><title type="html">Evaluate agents</title><link href="https://ccomkhj.github.io/EvaluateAgent/" rel="alternate" type="text/html" title="Evaluate agents" /><published>2025-12-22T00:00:00+00:00</published><updated>2025-12-22T00:00:00+00:00</updated><id>https://ccomkhj.github.io/EvaluateAgent</id><content type="html" xml:base="https://ccomkhj.github.io/EvaluateAgent/"><![CDATA[<p>Parts of an Agent You need to Evaluate</p>

<ul>
  <li>Routers: Function choice and parameter extraction
    <ul>
      <li>Did it call the right skills based on the scenario?</li>
    </ul>
  </li>
  <li>Skills: Can use standard LLM evaluations
    <ul>
      <li>Embed input query</li>
      <li>Vector DB lookup</li>
      <li>LLM call with retrieved context</li>
    </ul>
  </li>
  <li>Path: The most challenging to evaluate at scale</li>
</ul>

<p>How to evaluate these components</p>

<ul>
  <li>LLM as a Judge: Using other LLMs to evaluate
    <ul>
      <li>It will never be 100% correct</li>
      <li>Tuning your LLM judge prompt can help close this gap</li>
      <li>Always use discrete classification labels (incorrect vs correct, not 1-100% accuracy)</li>
      <li>Follow from 13:45 (https://www.youtube.com/watch?v=LpbGpJhndQ0)</li>
    </ul>
  </li>
  <li>Code-based Evals: Using traditional code checks</li>
  <li>Human Feedback: Using end-user or human labeler feedback</li>
</ul>]]></content><author><name>Huijo</name></author><category term="Machine Learning" /><summary type="html"><![CDATA[Notes on the parts of an agent worth evaluating — routers, skills, and paths — and the methods (LLM-as-judge, code checks, human feedback) that scale to each.]]></summary></entry><entry><title type="html">Role of Experience in AI era</title><link href="https://ccomkhj.github.io/RoleOfExperienceInAI/" rel="alternate" type="text/html" title="Role of Experience in AI era" /><published>2025-11-20T00:00:00+00:00</published><updated>2025-11-20T00:00:00+00:00</updated><id>https://ccomkhj.github.io/RoleOfExperienceInAI</id><content type="html" xml:base="https://ccomkhj.github.io/RoleOfExperienceInAI/"><![CDATA[<p>LLMs are ambitious when it comes to judging whether a task is feasible or not.
Because they don’t really reason—they mostly infer likely context based on patterns in their training data—most of what they “know” is how things are supposed to work. But the ability to estimate what is actually possible comes from experience, not from examples. Knowing the underlying science is essential, but what’s even rarer is the ability to take a problem, estimate the best way to solve it, build a plan to tackle it with ML, and confidently execute that plan. This skill usually comes only after multiple overly ambitious projects and missed deadlines.</p>

<p>How do you decide what to focus on in an ML project?
You need to find the impact bottleneck: the part of the pipeline that would provide the most value if improved. When working with companies, I often find that they’re not working on the right problem—or they’re not even at the growth stage where that problem matters.
There are often issues around the model, but the best way to find them is to temporarily replace the model with something simple and debug the entire pipeline. Very often the real issue isn’t model accuracy at all. Frequently, the product is dead even if the model works perfectly.</p>

<p>Once you have the whole pipeline, how do you identify the impact bottleneck?
Imagine that the bottleneck is fully solved and ask yourself: Was it worth the effort it would take to fix it?
It’s also incredibly valuable to manually inspect your model’s inputs and outputs. Scroll through a bunch of examples and see if anything looks strange. My department head at IBM had a mantra: do something manually for an hour before doing any real engineering work.
Curious whether your project is achievable with ML?
Or whether ML is even needed in the first place?
Contact me—I’m happy to help you find the right direction.</p>]]></content><author><name>Huijo</name></author><category term="Philosophy" /><summary type="html"><![CDATA[Why LLMs are unreliable judges of feasibility, and how to find the impact bottleneck of an ML project before tuning the model.]]></summary></entry><entry><title type="html">Causal Inference in Plants</title><link href="https://ccomkhj.github.io/CausalInferenceinPlants/" rel="alternate" type="text/html" title="Causal Inference in Plants" /><published>2025-11-11T00:00:00+00:00</published><updated>2025-11-11T00:00:00+00:00</updated><id>https://ccomkhj.github.io/CausalInferenceinPlants</id><content type="html" xml:base="https://ccomkhj.github.io/CausalInferenceinPlants/"><![CDATA[<p>I have built a yield forecasting model using LightGBM.
It worked surprisingly well, far more accurately than the agricultural experts who are physically on the farm.</p>

<p>It predicts strawberry and tomato yields based on growing conditions like temperature, light, CO₂, and humidity.<br />
The model performs well in forecasting, but I realized it only learns <strong>correlations</strong>, not <strong>causal relationships</strong>.<br />
So even if it predicts yield accurately, it cannot explain what will actually happen <strong>if the farm operator changes</strong> temperature or light in a real greenhouse for a given period of time.</p>

<p>To go beyond prediction, I started building a <strong>simulation system</strong> that recommends growing conditions based on customer demand.
Knowing the impact of each growing condition lets the farm reduce the energy cost per kg of production.
For that, I needed <strong>causal inference</strong>: a way to estimate how much yield changes when I intervene on variables (like raising temperature by +2°C).
Imagine how much it costs to heat a 10 ha facility by 2°C more.</p>

<h2 id="1-setting-up-causal-modeling">1. Setting up causal modeling</h2>

<p>I began by drawing a simple <strong>causal graph (DAG)</strong> that represents how different factors affect yield:</p>
<ul>
  <li>Temperature and light directly influence yield.</li>
  <li>CO₂, humidity (VPD), irrigation, and cultivar also affect yield and tend to move together with temperature and light (so they’re confounders).</li>
  <li>Some variables, like energy usage, are side effects and should not be adjusted for.</li>
</ul>

<p>This step helped me define what needs to be controlled to get valid causal effects.</p>
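<p>Why the choice of controls matters can be shown with synthetic data. This is a toy simulation, not my greenhouse data: a binary confounder (<code>sunny</code>) drives both the treatment (<code>heat</code>) and the outcome, and the true effect of the treatment is fixed at 1.0 by construction. All variable names and coefficients are invented for the illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random
import statistics

rng = random.Random(0)
TRUE_EFFECT = 1.0  # yield gained from the "high temperature" setting, by construction

rows = []
for _ in range(40000):
    sunny = rng.choices([0, 1], weights=[0.5, 0.5])[0]             # confounder
    p_heat = 0.8 if sunny else 0.2                                  # sunny days are run warmer
    heat = rng.choices([0, 1], weights=[1.0 - p_heat, p_heat])[0]   # treatment
    crop_yield = TRUE_EFFECT * heat + 2.0 * sunny + rng.gauss(0.0, 1.0)  # outcome
    rows.append((sunny, heat, crop_yield))


def mean_yield(heat, sunny=None):
    """Average yield for a treatment level, optionally within one confounder stratum."""
    return statistics.fmean(
        y for s, h, y in rows if h == heat and (sunny is None or s == sunny)
    )


# Naive contrast: mixes the effect of heat with the effect of sunny days.
naive = mean_yield(1) - mean_yield(0)

# Backdoor adjustment: contrast within each stratum, then weight by P(sunny).
adjusted = 0.0
for s in (0, 1):
    weight = sum(1 for row in rows if row[0] == s) / len(rows)
    adjusted += weight * (mean_yield(1, s) - mean_yield(0, s))

print(f"naive={naive:.2f}  adjusted={adjusted:.2f}  true={TRUE_EFFECT:.2f}")
</code></pre></div></div>

<p>The naive difference overstates the effect because warm settings coincide with sunny days; stratifying on the confounder recovers a value close to the true one. The DAG is what tells you which variables belong in that stratification.</p>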

<h2 id="2-integrating-with-dowhy">2. Integrating with DoWhy</h2>

<p>I used <strong>DoWhy</strong>, a causal inference library, to wrap around my LightGBM model.<br />
DoWhy helps connect the model to causal concepts:</p>

<ol>
  <li><strong>Model</strong> — define the DAG fully based on plant science. This is where I had to be a plant scientist.</li>
  <li><strong>Identify</strong> — find which variables to adjust for (backdoor criterion).</li>
  <li><strong>Estimate</strong> — use LightGBM as the outcome and treatment model.</li>
  <li><strong>Refute</strong> — test how stable and believable the effect is.</li>
</ol>

<p>My LightGBM still does the heavy lifting for nonlinear relationships, while DoWhy handles the logic of “<em>what if</em>” changes.</p>

<h2 id="3-checking-trustworthiness">3. Checking trustworthiness</h2>

<p>A big question was: can I trust these causal effects?</p>

<p>I tested this with:</p>
<ul>
  <li><strong>Placebo refuters</strong> — replace temperature with random noise.<br />
→ The estimated effect disappeared (good sign).</li>
  <li><strong>Subset refuters</strong> — re-run the model on random data splits.<br />
→ Results stayed consistent.</li>
  <li><strong>Sensitivity tests</strong> — check how strong a hidden confounder must be to change conclusions.<br />
→ Effects were robust unless confounding was extreme.</li>
</ul>
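<p>The idea behind the placebo check can be sketched without any causal library. The example below uses synthetic data and a plain least-squares slope as the “effect estimate”, and it builds the placebo by shuffling the treatment column; these are simplifying assumptions for illustration and not the exact procedure I ran with DoWhy.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random
import statistics

rng = random.Random(1)

# Synthetic data where temperature truly shifts yield by 0.5 per degree (by construction).
temperature = [rng.gauss(22.0, 2.0) for _ in range(5000)]
crop_yield = [0.5 * t + rng.gauss(0.0, 1.0) for t in temperature]


def slope(xs, ys):
    """Least-squares slope of ys on xs: the effect estimate in this toy setup."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


estimated = slope(temperature, crop_yield)

# Placebo: shuffle the treatment so it carries no information about yield.
placebo_temperature = temperature[:]
rng.shuffle(placebo_temperature)
placebo = slope(placebo_temperature, crop_yield)

print(f"estimated effect={estimated:.3f}  placebo effect={placebo:.3f}")
</code></pre></div></div>

<p>If the placebo estimate were as large as the real one, the pipeline would be manufacturing effects from noise. A placebo effect near zero is the “good sign” mentioned above.</p>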

<p>These tests gave me confidence that the model captures real causal influence, not just correlation noise.</p>

<h2 id="4-results-and-next-steps">4. Results and next steps</h2>

<p>Now, the system can simulate different environmental settings and show <strong>expected yield changes</strong> under those interventions.<br />
For example, “if I increase PAR by 15% by turning on LED while keeping CO₂ constant, yield increases by ~6%.”<br />
This gives a foundation for <strong>growing condition recommendations</strong>.</p>

<p>Next steps:</p>
<ul>
  <li>Combine multiple cultivars to learn condition-specific responses.</li>
  <li>Validate the system with controlled greenhouse trials.</li>
</ul>

<h2 id="5-reflection">5. Reflection</h2>

<p>This process taught me that causal inference is not just a statistical trick — it’s a framework for thinking.<br />
The key was translating plant science knowledge into a graph of cause and effect, then testing those assumptions systematically.<br />
Thanks to this work, farmers can not only predict yield, but also simulate <strong>what happens if we change the environment</strong> — a step closer to autonomous, data-driven cultivation.</p>]]></content><author><name>Huijo</name></author><category term="Machine Learning" /><summary type="html"><![CDATA[Adding causal inference (DoWhy + a hand-drawn DAG) on top of a LightGBM yield model so a greenhouse simulator can answer 'what happens if we raise temperature 2°C?' instead of just predicting tomorrow's yield.]]></summary></entry><entry><title type="html">Kubernetes Command Cheat Sheet</title><link href="https://ccomkhj.github.io/k8s/" rel="alternate" type="text/html" title="Kubernetes Command Cheat Sheet" /><published>2025-11-02T00:00:00+00:00</published><updated>2025-11-02T00:00:00+00:00</updated><id>https://ccomkhj.github.io/k8s</id><content type="html" xml:base="https://ccomkhj.github.io/k8s/"><![CDATA[<p>A clean and practical reference for managing and troubleshooting Kubernetes clusters.<br />
These are the commands I use most often.</p>

<h1 id="-pods-inspect-logs-exec-manage">📦 Pods (Inspect, Logs, Exec, Manage)</h1>

<h2 id="-list--inspect-pods">🔍 List &amp; Inspect Pods</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods
kubectl get pods <span class="nt">-n</span> &lt;namespace&gt;
kubectl get pods <span class="nt">--all-namespaces</span>
kubectl describe pod &lt;pod-name&gt;
</code></pre></div></div>

<h3 id="pods-on-a-specific-node">Pods on a specific node</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods <span class="nt">-A</span> <span class="nt">-o</span> wide | <span class="nb">grep</span> &lt;node&gt;
</code></pre></div></div>

<h3 id="pod-resource-requests">Pod resource requests</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods <span class="nt">-n</span> &lt;ns&gt; <span class="nt">-o</span> json | jq <span class="nt">-r</span> <span class="s1">'.items[] |
  "\(.metadata.name)\tCPU:\(.spec.containers[0].resources.requests.cpu)\tMEM:\(.spec.containers[0].resources.requests.memory)"'</span>
</code></pre></div></div>

<h3 id="pod-tolerations">Pod tolerations</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods <span class="nt">-n</span> &lt;ns&gt; <span class="nt">-o</span> json | jq <span class="nt">-r</span> <span class="s1">'.items[] |
  "\(.metadata.name)\t\(.spec.tolerations | map(.key) | join(","))"'</span>
</code></pre></div></div>

<h3 id="pod-events">Pod events</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl describe pod &lt;pod&gt; <span class="nt">-n</span> &lt;ns&gt; | <span class="nb">grep</span> <span class="nt">-A</span> 10 <span class="s2">"Events:"</span>
</code></pre></div></div>

<h3 id="pod-resource-usage-metrics-server">Pod resource usage (metrics-server)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl top pods <span class="nt">-n</span> &lt;ns&gt;
</code></pre></div></div>

<h2 id="-logs">🪵 Logs</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl logs &lt;pod&gt;
kubectl logs &lt;pod&gt; <span class="nt">-c</span> &lt;container&gt;
kubectl logs <span class="nt">-f</span> &lt;pod&gt;
kubectl logs <span class="nt">-l</span> <span class="nv">app</span><span class="o">=</span>&lt;label&gt;
</code></pre></div></div>

<p>One-liner to fetch logs from the first pod that matches a keyword (here, <code class="language-plaintext highlighter-rouge">prod</code>):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl logs <span class="nt">-n</span> name_space <span class="s2">"</span><span class="si">$(</span>kubectl get pods <span class="nt">-n</span> name_space <span class="nt">--no-headers</span> | <span class="nb">awk</span> <span class="s1">'/prod/ {print $1; exit}'</span><span class="si">)</span><span class="s2">"</span>
</code></pre></div></div>
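
<p>The <code class="language-plaintext highlighter-rouge">awk</code> pattern above matches the keyword anywhere in the row, including the status column. A small sketch that matches on the pod name only; <code class="language-plaintext highlighter-rouge">first_pod_matching</code> is a hypothetical helper that reads <code class="language-plaintext highlighter-rouge">kubectl get pods --no-headers</code> output on stdin:</p>

```shell
# Print the name of the first pod whose NAME column contains the keyword.
# first_pod_matching is a hypothetical helper; it only parses text on stdin.
first_pod_matching() {
  awk -v kw="$1" 'index($1, kw) { print $1; exit }'
}

# Usage (namespace and keyword are placeholders):
#   ns=my-namespace
#   kubectl logs -n "$ns" "$(kubectl get pods -n "$ns" --no-headers | first_pod_matching prod)"
```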

<h2 id="-exec--shell-access">💻 Exec / Shell Access</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl <span class="nb">exec</span> <span class="nt">-it</span> &lt;pod&gt; <span class="nt">--</span> /bin/bash
kubectl <span class="nb">exec</span> <span class="nt">-it</span> &lt;pod&gt; <span class="nt">--</span> /bin/sh
kubectl <span class="nb">exec</span> <span class="nt">-it</span> &lt;pod&gt; <span class="nt">-c</span> &lt;container&gt; <span class="nt">--</span> /bin/bash
</code></pre></div></div>

<h2 id="-pod-management">🧹 Pod Management</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl delete pod &lt;pod&gt; <span class="nt">-n</span> &lt;ns&gt;
kubectl delete pod &lt;pod&gt; <span class="nt">-n</span> &lt;ns&gt; <span class="nt">--force</span> <span class="nt">--grace-period</span><span class="o">=</span>0
kubectl delete pod <span class="nt">--all</span> <span class="nt">-n</span> &lt;ns&gt;
kubectl delete pod <span class="nt">-n</span> &lt;ns&gt; <span class="nt">-l</span> <span class="nv">key</span><span class="o">=</span>value
</code></pre></div></div>

<h1 id="-services">🌐 Services</h1>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get services
kubectl get services <span class="nt">-n</span> &lt;namespace&gt;
kubectl get services <span class="nt">--all-namespaces</span>
</code></pre></div></div>

<h1 id="-nodes-inspect-usage-manage">🧠 Nodes (Inspect, Usage, Manage)</h1>

<h2 id="-node-info">🔍 Node Info</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get nodes
kubectl get nodes <span class="nt">-o</span> wide
kubectl describe node &lt;node&gt;
</code></pre></div></div>

<h3 id="node-labels-taints-instance-type">Node labels, taints, instance type</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get node &lt;node&gt; <span class="nt">-o</span> json | jq <span class="s1">'.metadata.labels'</span>
kubectl describe node &lt;node&gt; | <span class="nb">grep </span>Taints
kubectl get node &lt;node&gt; <span class="nt">-o</span> json | jq <span class="nt">-r</span> <span class="s1">'.metadata.labels."node.kubernetes.io/instance-type"'</span>
</code></pre></div></div>

<h2 id="-node-resource-usage">📊 Node Resource Usage</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl describe node &lt;node&gt; | <span class="nb">grep</span> <span class="nt">-A</span> 5 <span class="s2">"Allocated resources"</span>
kubectl top nodes
</code></pre></div></div>

<h2 id="-node-management-cordon--drain--delete">🛠 Node Management (Cordon / Drain / Delete)</h2>

<h3 id="-cordon--stop-scheduling-new-pods">🔹 Cordon — stop scheduling new pods</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl cordon &lt;node&gt;
</code></pre></div></div>

<h3 id="-drain--evict-workloads-safely">🔹 Drain — evict workloads safely</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl drain &lt;node&gt; <span class="nt">--ignore-daemonsets</span> <span class="nt">--delete-emptydir-data</span> <span class="nt">--force</span>
</code></pre></div></div>

<h3 id="-delete--remove-node-from-kubernetes">🔹 Delete — remove node from Kubernetes</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl delete node &lt;node&gt; <span class="nt">--force</span> <span class="nt">--grace-period</span><span class="o">=</span>0
</code></pre></div></div>

<h3 id="-quick-comparison">📘 Quick Comparison</h3>

<table>
  <thead>
    <tr>
      <th>Action</th>
      <th>Stops New Pods?</th>
      <th>Evicts Pods?</th>
      <th>Removes Node?</th>
      <th>Use Case</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>cordon</strong></td>
      <td>✔️</td>
      <td>❌</td>
      <td>❌</td>
      <td>Prepare for maintenance</td>
    </tr>
    <tr>
      <td><strong>drain</strong></td>
      <td>✔️</td>
      <td>✔️</td>
      <td>❌</td>
      <td>Safely remove workloads</td>
    </tr>
    <tr>
      <td><strong>delete</strong></td>
      <td>✔️</td>
      <td>✔️ (node gone)</td>
      <td>✔️</td>
      <td>Node is broken or gone</td>
    </tr>
  </tbody>
</table>

<h1 id="-storage-pvc">🧱 Storage (PVC)</h1>

<h3 id="list-pvcs">List PVCs</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pvc <span class="nt">-A</span>
kubectl get pvc <span class="nt">-n</span> &lt;namespace&gt;
</code></pre></div></div>

<h3 id="describe-a-pvc">Describe a PVC</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl describe pvc &lt;pvc-name&gt; <span class="nt">-n</span> &lt;namespace&gt;
</code></pre></div></div>

<h3 id="check-volume-affinity-issues">Check volume affinity issues</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl describe pod &lt;pod&gt; <span class="nt">-n</span> &lt;ns&gt; | <span class="nb">grep</span> <span class="nt">-i</span> volume
</code></pre></div></div>

<h1 id="-pod-disruption-budgets-pdb">⛑ Pod Disruption Budgets (PDB)</h1>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pdb <span class="nt">-A</span>
kubectl get pdb <span class="nt">-n</span> &lt;namespace&gt;
kubectl describe pdb &lt;pdb&gt; <span class="nt">-n</span> &lt;namespace&gt;
</code></pre></div></div>

<h1 id="-useful-commands">🔧 Useful Commands</h1>

<h3 id="count-pods-on-a-node">Count pods on a node</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods <span class="nt">-A</span> <span class="nt">-o</span> wide | <span class="nb">grep</span> &lt;node&gt; | <span class="nb">wc</span> <span class="nt">-l</span>
</code></pre></div></div>

<h3 id="count-pods-by-namespace">Count pods by namespace</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods <span class="nt">-A</span> <span class="nt">--no-headers</span> | <span class="nb">awk</span> <span class="s1">'{print $1}'</span> | <span class="nb">sort</span> | <span class="nb">uniq</span> <span class="nt">-c</span>
</code></pre></div></div>
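
<p>The same count can be done in a single <code class="language-plaintext highlighter-rouge">awk</code> pass, sorted so the busiest namespace comes first. This is only a sketch: <code class="language-plaintext highlighter-rouge">count_by_namespace</code> is a hypothetical helper, and the sample rows are made up to stand in for live cluster output.</p>

```shell
# Count pods per namespace from `kubectl get pods -A --no-headers` output on stdin.
# count_by_namespace is a hypothetical helper name, not a kubectl feature.
count_by_namespace() {
  awk '{ n[$1]++ } END { for (ns in n) print n[ns], ns }' | sort -k1,1nr -k2,2
}

# Made-up sample rows in place of real kubectl output.
printf '%s\n' \
  'kube-system coredns-abc 1/1 Running 0 3d' \
  'default     web-1       1/1 Running 0 1d' \
  'default     web-2       1/1 Running 0 1d' \
  | count_by_namespace
# prints:
# 2 default
# 1 kube-system
```

<p>Against a real cluster, pipe <code class="language-plaintext highlighter-rouge">kubectl get pods -A --no-headers</code> into the helper instead of the sample rows.</p>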

<h3 id="list-instance-types-in-the-cluster">List instance types in the cluster</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get nodes <span class="nt">-o</span> json | jq <span class="nt">-r</span> <span class="se">\</span>
<span class="s1">'.items[].metadata.labels."node.kubernetes.io/instance-type"'</span> | <span class="nb">sort</span> | <span class="nb">uniq</span>
</code></pre></div></div>

<h3 id="node--workload-mapping">Node → workload mapping</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get nodes <span class="nt">-o</span> json | jq <span class="nt">-r</span> <span class="s1">'.items[] |
  "\(.metadata.name)\t\(.metadata.labels.workload // "none")\t\(.metadata.labels."node.kubernetes.io/instance-type")"'</span>
</code></pre></div></div>]]></content><author><name>Huijo</name></author><category term="Programing" /><summary type="html"><![CDATA[A practical kubectl reference — the commands I reach for most when inspecting pods, debugging nodes, and managing clusters.]]></summary></entry><entry><title type="html">Self Image: The way you conceive yourself is what you will be.</title><link href="https://ccomkhj.github.io/SelfImage/" rel="alternate" type="text/html" title="Self Image: The way you conceive yourself is what you will be." /><published>2025-11-01T00:00:00+00:00</published><updated>2025-11-01T00:00:00+00:00</updated><id>https://ccomkhj.github.io/SelfImage</id><content type="html" xml:base="https://ccomkhj.github.io/SelfImage/"><![CDATA[<p>My friends describe my character very differently depending on when they met me.
Friends who met me during my freshman and sophomore years in college remember me as a really social guy.
I used to lead events, and I did not hesitate to hang out with new people.</p>

<p>Friends who met me after I turned 25, on the other hand, describe me as far from talkative.
I listened more than I talked, and I did not lead the party.</p>

<p>I am the same person, and I did not suffer any trauma. What made me behave so differently?
My answer used to be: “I am naturally introverted, but I tried to be an extrovert at the beginning of my college life because that is who I wanted to be.”
So, as the saying goes, “Fake it till you make it.”</p>

<p>After reading <em>Psycho-Cybernetics</em> by Maxwell Maltz, I now have a clearer view of why it happened.
I believe I am open to uncertainty and challenges in my life because I have walked away from several paths.
Going back to school after leaving a good job in Korea.
Founding a startup after turning down offers from well-known companies.
Leaving my company just as it was growing, to pursue a bigger vision.</p>]]></content><author><name>Huijo</name></author><category term="Philosophy" /><category term="Reading" /><summary type="html"><![CDATA[Why the same person looked extroverted to college friends and introverted to friends from age 25 — and what *Psycho-Cybernetics* says about self-image as the real driver behind that gap.]]></summary></entry><entry><title type="html">Punished by Rewards: What truly motivates your team? It’s not rewards.</title><link href="https://ccomkhj.github.io/PunishedByRewards/" rel="alternate" type="text/html" title="Punished by Rewards: What truly motivates your team? It’s not rewards." /><published>2025-10-18T00:00:00+00:00</published><updated>2025-10-18T00:00:00+00:00</updated><id>https://ccomkhj.github.io/PunishedByRewards</id><content type="html" xml:base="https://ccomkhj.github.io/PunishedByRewards/"><![CDATA[<p>Through running businesses and working in both large corporations and startups,<br />
one thing is clear to me: people make a business move forward or backward.<br />
So, it’s key to (1) hire the right people and (2) manage the team right.</p>

<p>In my first startup, I was obsessed with hiring the right people, believing that once you hire the right ones, they will succeed.<br />
In retrospect, hiring the right people is necessary but not sufficient for success.</p>

<p>After stepping away from the hectic founder life, I had the chance to nurture my entrepreneurial spirit through books.<br />
In this writing, I’m going to summarize my takeaways from the book <em>Punished by Rewards</em> by Alfie Kohn, to remind my future self of its insights.</p>

<p>The book’s focus is on how to motivate employees (and similarly, students for educators or children for parents).<br />
Kohn argues that both rewards and punishments harm genuine motivation, and he backs the claim with hundreds of research references.<br />
Rewards include things like bonuses tied to KPIs. If bonuses are inevitable, they should not be announced beforehand, but rather given unexpectedly — so employees don’t work just for the money.</p>

<p>While reading the 200+ pages (in a few days), I kept asking myself: if rewards are bad, what can a leader use instead?<br />
The author argues that intrinsic motivation is the true driver of lasting success.<br />
The main reason is simple: no company can keep increasing financial rewards forever.<br />
Moreover, when the bonus amount is small, performance actually drops.<br />
When the bonus is large, it motivates temporarily — but once it’s removed, performance falls sharply.</p>

<p>The author calls this the <strong>“If-Then” trap</strong> — “If you do this, then you’ll get that.”<br />
This is not just about money; it’s about any conditional reward, including praise, perks, or privileges.<br />
The real problem is not the reward itself, but the message behind it: that people must be <em>controlled</em>.<br />
When rewards are used to manipulate behavior, they kill genuine interest and trust.</p>

<p>The problem with financial incentives isn’t that people are offered too much money; the problem is that money becomes too salient.<br />
It’s driven by the principle of “Do this, and you’ll get that.”<br />
You can’t just remove the carrot and stick; you must rebuild the system so people have genuine reasons to care.</p>

<p>So, what creates intrinsic motivation?<br />
We need to focus on the <strong>3Cs</strong>: <strong>Collaboration, Content, and Choice.</strong></p>

<p>Everyone feels great and motivated when collaborating with others — in business, that means with colleagues.<br />
If your support helps your colleague get a bonus while you don’t, it’s natural to feel less eager to prioritize helping next time, even if it benefits the business overall.</p>

<p>If what you’re working on is meaningful to you, you’re naturally motivated.<br />
That’s why, when I was running an agriculture startup, my team was deeply motivated — we were solving problems related to food and the planet.<br />
I believe NGOs can attract top talent for the same reason: the work itself is fulfilling.</p>

<p>Another critical point is <strong>autonomy</strong>, or the power of <em>Choice</em>.<br />
True motivation flourishes when people feel they have control over their work — how they do it, when they do it, and what problems they choose to solve.<br />
Leaders often think they need to “motivate” others, but as Kohn says, you actually <em>can’t motivate people</em>.<br />
You can only <strong>create conditions</strong> where they can motivate themselves.<br />
The leader’s role is to remove obstacles and build an environment where people feel trusted, capable, and connected.</p>

<p>In business, the founder (or leader) should set the company vision and major milestones.<br />
Otherwise, there’s too much noise — and it’s the founder’s role to think deeply about long-term goals. (It’s your company; you gain the most from its success.)<br />
At the same time, when it comes to achieving each milestone, it’s crucial to listen to your team’s voices and reflect their decisions.<br />
When people make their own choices, they not only feel more motivated but also take natural ownership.</p>

<p>Kohn also highlights another consequence of reward-driven systems: <strong>creativity loss</strong>.<br />
When people work for rewards, they focus on doing just enough to earn them — not exploring, innovating, or taking risks.<br />
Studies show that rewards make people play it safe. Ironically, systems meant to boost performance end up discouraging curiosity and experimentation.</p>

<p>To be clear, this doesn’t mean people will work for free.<br />
The book clearly states that fair monetary compensation is essential — but <em>pay for performance</em> is not recommended.<br />
Here are the key principles:<br />
Pay people generously. Do your best to ensure they don’t feel exploited.<br />
Then do everything possible to help them stop thinking about money.</p>

<p>Finally, it’s worth noting that <strong>rewards aren’t limited to money</strong>.<br />
Even praise, when used as a tool to steer behavior (“Good job!” said too often), can have the same negative effect.<br />
People start working for approval rather than curiosity or pride in their craft.</p>

<p>This book reminded me that the best teams aren’t built on incentives but on trust, purpose, and shared ownership.<br />
A leader’s goal is not to hand out rewards but to create a space where people genuinely want to do great work.</p>]]></content><author><name>Huijo</name></author><category term="Business" /><category term="Reading" /><summary type="html"><![CDATA[Notes from Alfie Kohn's *Punished by Rewards*: why bonuses, KPIs, and 'if-then' incentives kill intrinsic motivation, and what to lean on instead when you can't keep raising the carrot.]]></summary></entry><entry><title type="html">From Founder to Leader: Lessons I Learned the Hard Way</title><link href="https://ccomkhj.github.io/LessonsILearnedtheHardWay/" rel="alternate" type="text/html" title="From Founder to Leader: Lessons I Learned the Hard Way" /><published>2025-10-05T00:00:00+00:00</published><updated>2025-10-05T00:00:00+00:00</updated><id>https://ccomkhj.github.io/LessonsILearnedtheHardWay</id><content type="html" xml:base="https://ccomkhj.github.io/LessonsILearnedtheHardWay/"><![CDATA[<p>When I first co-founded my startup, I had no formal experience in leadership. I built a team from scratch — sourcing every early hire through LinkedIn, conducting interviews myself, and making offers based on conviction and chemistry. It was a crash course in leadership, and while we achieved a lot, looking back, there’s so much I would do differently today.</p>

<p>Great bosses aren’t born overnight – they’re shaped by the habits they practice every day. After reflecting on both my own experience and lessons from leadership books, I’ve come to believe in five core practices that truly define great leaders when they’re applied consistently.</p>

<h3 id="1-give-clear-direction">1. Give Clear Direction</h3>

<p>At my startup, I shared a strong vision that attracted incredible people — some joined even when the compensation didn’t match their market value. Our mission was inspiring enough to unite us. But here’s the hard truth: I didn’t revisit that vision often enough.</p>

<p>A compelling vision doesn’t live in a single all-hands or pitch deck. It has to be retold, refined, and reinforced every 90 days so that people don’t just hear it but <em>own it</em>. Hearing it once isn’t enough — they need to understand it deeply and connect it to their own goals. I learned this too late, and it’s something I would prioritize if I were leading a team again.</p>

<h3 id="2-provide-the-necessary-tools">2. Provide the Necessary Tools</h3>

<p>I always made sure my team had access to resources — GPUs, papers, new frameworks, anything that could help them explore faster. What I overlooked, though, was that <em>my time</em> was often the most valuable resource I could give.</p>

<p>Resources like training, technology, and extra help matter – but a leader’s attention matters most. I used to assume casual chats or shared code reviews were enough, but structured one-on-ones might have uncovered deeper needs or bottlenecks. The simplest way to confirm that your team has what they need to do great work? Just ask.</p>

<h3 id="3-let-go">3. Let Go</h3>

<p>Once expectations are clear and people have what they need, leaders need to resist the urge to micromanage. Early on, I sometimes over-involved myself because the work was deep tech and difficult to measure. In research-driven environments, tangible progress is often fuzzy — you might be “stuck” for weeks, yet still be advancing.</p>

<p>Because I didn’t set clear expectations from the beginning, I found myself constantly clarifying, defending, or justifying the team’s progress to external stakeholders. It drained energy that could have been spent supporting the team. The best lesson I learned: focus on people who understand their roles, genuinely want the responsibility, and have the ability to deliver — then give them space to succeed.</p>

<h3 id="4-act-with-the-greater-good-in-mind">4. Act with the Greater Good in Mind</h3>

<p>Short-term wins can be tempting, especially under investor pressure. But I’ve learned that when you consistently choose what’s best for the team or organization — even if it’s harder in the short term — you build lasting credibility.</p>

<p>When I had to argue for my team’s value during tough discussions about layoffs, I realized how crucial it is for leadership actions to align with the long-term vision. Acting with integrity and transparency, even under stress, shapes how people remember your leadership long after you leave.</p>

<h3 id="5-set-aside-time-to-reflect">5. Set Aside Time to Reflect</h3>

<p>In the early days, I was always in execution mode — hiring, coding, firefighting. I rarely paused to step back and reflect. Yet reflection is where real growth happens. Whether it’s a quiet hour each week or a full-day offsite, stepping back helps you see the bigger picture.</p>

<p>I now believe leaders need to <em>schedule reflection</em> the way they schedule standups. It’s not indulgence — it’s maintenance for clear thinking and better decisions.</p>

<h2 id="the-management-habits-that-sustain-leadership">The Management Habits That Sustain Leadership</h2>

<p>Leadership sets the direction, but management keeps the ship steady. Here are five management practices that go hand in hand with leadership — ones I’ve learned both by missing and by doing.</p>

<h3 id="1-set-clear-expectations">1. Set Clear Expectations</h3>

<p>I used to assume my team understood priorities and responsibilities intuitively. We were a small, tight-knit group, so I thought shared context was enough. It wasn’t. Without explicit expectations, accountability becomes ambiguous.</p>

<p>People need to know their roles, the values that guide decisions, the priorities that matter most, and the results they’re accountable for. Without clarity in these areas, accountability is impossible.</p>

<h3 id="2-communicate-well">2. Communicate Well</h3>

<p>In my first startup, communication often happened informally — over Slack threads or coffee chats. It worked surprisingly well for creativity, but not always for alignment. I learned that great communication isn’t about <em>talking often</em>, it’s about <em>checking for understanding</em>.</p>

<p>Don’t rely on assumptions – ask questions, listen carefully, and verify mutual understanding. Open dialogue builds trust and ensures that nothing important is left unsaid.</p>

<h3 id="3-establish-a-steady-meeting-rhythm">3. Establish a Steady Meeting Rhythm</h3>

<p>I didn’t set regular one-on-ones officially, though I did have frequent informal discussions. Those casual chats were fantastic for motivation and brainstorming, but they lacked structure. If I could redo it, I’d add a weekly rhythm — team meetings with clear agendas and one-on-ones during the first 90 days for every new hire.</p>

<p>That early attention aligns expectations quickly and prevents misalignment later on.</p>

<h3 id="4-hold-quarterly-conversations">4. Hold Quarterly Conversations</h3>

<p>This is one of my biggest takeaways from hindsight. We never had formal quarterly off-sites, and I now realize how powerful they could have been. Meeting off-site, away from daily distractions, to discuss what’s working, what isn’t, and how each person is living up to their role, values, and priorities — that’s where deeper connection and synergy form.</p>

<p>Quarterly conversations keep relationships and goals from fraying, even in fast-moving startups.</p>

<h3 id="5-reward-and-recognize">5. Reward and Recognize</h3>

<p>At my startup, we were strong on passion but weak on celebration. Feedback and recognition happened naturally but sporadically. I’ve since seen other teams do this much better — giving quick, public praise and private, constructive feedback within 24 hours.</p>

<p>It’s such a small thing, but consistent recognition creates momentum and trust.</p>

<p>My journey has taken me from founder to leader, and now, back to being a follower in another startup. This transition has taught me that great leadership isn’t about authority — it’s about discipline, clarity, and care.</p>]]></content><author><name>Huijo</name></author><category term="Philosophy" /><category term="Business" /><summary type="html"><![CDATA[Five leadership habits I would practice differently if I co-founded another startup tomorrow — drawn from running an early-stage team without prior management experience.]]></summary></entry><entry><title type="html">Judge the value of your company</title><link href="https://ccomkhj.github.io/JudgeValue/" rel="alternate" type="text/html" title="Judge the value of your company" /><published>2025-09-22T00:00:00+00:00</published><updated>2025-09-22T00:00:00+00:00</updated><id>https://ccomkhj.github.io/JudgeValue</id><content type="html" xml:base="https://ccomkhj.github.io/JudgeValue/"><![CDATA[<p>After founding and forming my startup, I left my company.
Since the company has found product-market fit, it runs well without my day-to-day involvement.
Automating all technical operations has finally paid off.</p>

<p>Now that my shares are passive (common stock no longer tied to an active founder), they have become appealing to investors.
One reason is to buy them while they are relatively inexpensive.
Another is to remove a potential red flag ahead of the next financing round.
For some investors, a large stake held by a departed founder is far from ideal.</p>

<p>So what determines the value of stock in the private market, specifically for an early-stage startup?
In the traditional valuation approach, <code class="language-plaintext highlighter-rouge">some factor</code> * <code class="language-plaintext highlighter-rouge">annual revenue</code> or <code class="language-plaintext highlighter-rouge">some factor</code> * <code class="language-plaintext highlighter-rouge">annual profit</code> is applied.
However, <code class="language-plaintext highlighter-rouge">annual profit</code> is mostly negative for startups (especially VC-backed ones; otherwise there would be little need to raise capital).
So the profit multiple is mostly used for mature businesses and industries.</p>
<blockquote>
  <p>See the Appendix below for typical values of <code class="language-plaintext highlighter-rouge">some factor</code>.</p>
</blockquote>
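
<p>As a rough worked example of the revenue-multiple estimate, take a hypothetical company with US$2M in ARR (an assumed figure, not from any real deal) and the growth-stage range of 2× to 6× listed in the Appendix. Writing \(V\) for the valuation and \(m\) for the multiple:</p>

\[V = m \times \text{ARR}, \qquad 2 \times 2\,\text{M} = 4\,\text{M} \;\le\; V \;\le\; 6 \times 2\,\text{M} = 12\,\text{M} \quad (\text{US dollars}).\]

<p>The threefold spread between the two ends is one reason a multiple alone rarely settles a negotiation.</p>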

<p>I have been through several negotiations with investors, and it is still hard to find common ground.
Naturally, the seller asks high and the buyer bids low.
The value is up in the air, so nobody can objectively say who is outside the reasonable range.</p>

<p>One strategy is to anchor on the share price from the last round.
Of course, there are caveats to consider in this case.</p>
<ol>
  <li>In a secondary sale, the money goes to the seller rather than into the company, where it could have funded further innovation.</li>
  <li>If the share class differs (common vs. preferred), so does the value.</li>
  <li>The last round may have happened months or even years ago.<br />
   The business could have <strong>evolved—or deteriorated—over time</strong>, making that old valuation less reliable.</li>
</ol>

<p>When expectations diverge, arguing over a single number rarely helps.
What works better is structuring the deal so it moves with reality.
Sometimes the price isn’t fixed at signing but adjusts later—say, when the next funding round sets a fresh valuation.
Sometimes the sale happens in stages: a portion now, the rest when milestones are met.
A neutral valuation can also help anchor both sides, and non-cash elements—faster liquidity, a role, introductions—can close the gap without simply raising the cash number.
In the end, early-stage secondary sales are less about formulas and more about aligning incentives and trust.
Clarity on risk, share rights, and the company’s trajectory does more to seal a deal than any spreadsheet multiple.</p>

<hr />

<h2 id="appendix-typical-valuation-multiples-by-industry--stage">Appendix: Typical Valuation Multiples by Industry &amp; Stage</h2>

<table>
  <thead>
    <tr>
      <th>Industry / Business Model</th>
      <th>Company Stage / ARR / Size</th>
      <th>Typical Revenue Multiple (EV/Revenue or ARR)</th>
      <th>Notes &amp; Considerations</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>SaaS / Subscription Software</strong></td>
      <td>Early‐stage, ARR &lt; ≈ US$1-2M</td>
      <td>~ <strong>1× to 3×</strong></td>
      <td>Applies when growth is modest, churn is high, or retention is unproven. (<a href="https://www.classvipartners.com/what-are-the-multiples-for-saas-businesses-for-2024/" title="What Are the Multiples for SaaS Businesses for 2024?">Class VI Partners</a>)</td>
    </tr>
    <tr>
      <td> </td>
      <td>Growth stage, ARR between US$1-10M</td>
      <td>~ <strong>2× to 6×</strong> (sometimes more, up to ~8× for strong metrics)</td>
      <td>Higher growth, better metrics (low churn, good gross margins, recurring revenue) push toward upper end. (<a href="https://aventis-advisors.com/saas-valuation-multiples/" title="SaaS Valuation Multiples: 2015-2025 - Aventis Advisors">Aventis Advisors</a>)</td>
    </tr>
    <tr>
      <td> </td>
      <td>Mature SaaS, ARR &gt; US$10M &amp; good retention/profitability</td>
      <td>~ <strong>5× to 10×+</strong></td>
      <td>More stable, less risk → higher multiple. But investors expect returns and some path to profitability. (<a href="https://blog.acquire.com/saas-valuation-multiples/" title="Top 7 SaaS Valuation Multiples to Know in 2025">Acquire.com Blog</a>)</td>
    </tr>
    <tr>
      <td><strong>Tech / B2B (non-SaaS)</strong></td>
      <td>Smaller revenue ($1-5M range)</td>
      <td>~ <strong>2× to 3×</strong></td>
      <td>Includes hardware, non-recurring sales, services. Less recurring revenue → lower multiple. (<a href="https://firstpagesage.com/business/valuation-ebitda-multiples-for-tech-companies/" title="Valuation &amp; EBITDA Multiples for Tech Companies: 2025 Report">First Page Sage</a>)</td>
    </tr>
    <tr>
      <td> </td>
      <td>Medium size ($5-10-$75M)</td>
      <td>~ <strong>2.5× to 4×</strong></td>
      <td>Growth, scale, repeat business help. (<a href="https://firstpagesage.com/business/valuation-ebitda-multiples-for-tech-companies/" title="Valuation &amp; EBITDA Multiples for Tech Companies: 2025 Report">First Page Sage</a>)</td>
    </tr>
    <tr>
      <td><strong>Fintech</strong></td>
      <td>Private, high-growth fintechs</td>
      <td>~ <strong>3× to 5×</strong> revenue in many cases</td>
      <td>If growth is very strong and risk tolerable, can go higher. Higher regulatory risk tends to suppress multiple somewhat. (<a href="https://firstpagesage.com/business/fintech-valuation-multiples/" title="Fintech Valuation Multiples: 2025 Report - First Page Sage">First Page Sage</a>)</td>
    </tr>
    <tr>
      <td><strong>Private SaaS / Bootstrapped</strong></td>
      <td>Smaller, bootstrapped SaaS firms</td>
      <td>~ <strong>4.5×-6×</strong></td>
      <td>Data from SaaS Capital: bootstrapped firms tend to get lower multiples than equity-backed, but still meaningful. (<a href="https://www.saas-capital.com/blog-posts/private-saas-company-valuations-multiples/" title="2025 Private SaaS Company Valuations">SaaS Capital</a>)</td>
    </tr>
    <tr>
      <td><strong>Public SaaS / Big-cap</strong></td>
      <td>Larger, public SaaS companies</td>
      <td>~ <strong>5× to 15×+</strong></td>
      <td>When metrics (growth, margins, retention) are excellent, multiples at high end; otherwise lower. (<a href="https://blog.acquire.com/saas-valuation-multiples/" title="Top 7 SaaS Valuation Multiples to Know in 2025">Acquire.com Blog</a>)</td>
    </tr>
  </tbody>
</table>

<hr />]]></content><author><name>Huijo</name></author><category term="Philosophy" /><category term="Business" /><summary type="html"><![CDATA[How private-market valuation actually works for an early-stage startup once the founder steps away — and why the textbook revenue/profit multiples don't apply when profit is negative by design.]]></summary></entry></feed>