<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Representation-Learning on tl</title><link>https://blog.tklingard.com/tags/representation-learning/</link><description>Recent content in Representation-Learning on tl</description><generator>Hugo</generator><language>en</language><lastBuildDate>Tue, 30 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.tklingard.com/tags/representation-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>You say Potato, I say 110</title><link>https://blog.tklingard.com/posts/2026-06/rdmreg-semantic-hashing-part-1/</link><pubDate>Tue, 30 Jun 2026 00:00:00 +0000</pubDate><guid>https://blog.tklingard.com/posts/2026-06/rdmreg-semantic-hashing-part-1/</guid><description>&lt;p&gt;I was re-reading the &lt;a href="https://arxiv.org/abs/2602.01456" class="external-link" target="_blank" rel="noopener"&gt;Rectified LpJEPA&lt;/a&gt; paper earlier this week and it got me thinking: the paper lets us regularize model outputs to match a desired global distribution. What other embedding spaces might we care about?&lt;/p&gt;
&lt;p&gt;The first thing to come to mind was &lt;a href="https://www.cs.utoronto.ca/~rsalakhu/papers/semantic_final.pdf" class="external-link" target="_blank" rel="noopener"&gt;semantic hashing&lt;/a&gt;. It is in many ways the precursor to RAG systems: each input is mapped to a series of bits that reflects its semantic meaning, and an efficient bit-counting metric (e.g. Hamming distance) is used to identify documents similar to a query. Combining a binary representation with such a cheap metric should allow incredibly fast retrieval when paired with approximate nearest-neighbour algorithms such as &lt;a href="https://pynndescent.readthedocs.io/en/latest/how_pynndescent_works.html" class="external-link" target="_blank" rel="noopener"&gt;PyNNDescent&lt;/a&gt; (or HNSW, I just love PyNNDescent&amp;rsquo;s explainer page).&lt;/p&gt;</description></item></channel></rss>