Programmer and sysadmin (DevOps?), wannabe polymath in tech, science and the mind. Neurodivergent, disabled, burned out, and close to throwing in the towel, but still liking ponies 🦄 and sometimes willing to discuss stuff.

  • 0 Posts
  • 25 Comments
Joined 2 years ago
Cake day: June 26th, 2023

  • The older something is, the more people have grown used to it, but also the more chances they’ve had to get burned by it:

    • C was released in 1972 (52 years), C99 was released in 1999 (25 years), hasn’t changed much since
    • C++ was released in 1998 (26 years), there are 7 versions of C++ with notable changes
    • Rust was released in 2015 (9 years), it’s still on the same 1.x version implying backwards compatibility

    Rust was created to fix some of the problems C and C++ have had for decades, it’s only logical that people like it more… for now.



  • Neither.

    If you can code it in a week (1), start from scratch. You’ll have a working prototype in a month, then can decide whether it was worth the effort.

    If it’s a larger codebase, start by splitting it into week-sized chunks (1), then try rewriting them one by one. Keep good test coverage, linked to particular issues, and from time to time go over the tests to see what can be trimmed or deprecated.

    Legible code should not require “reverse engineering”: there should be comments linking to issues, use cases, an architecture overview, and so on. If you’re lacking those, start there, no matter which path you pick.

    (1) As a rule of thumb, starting from scratch you can expect a single person to write 1 clean line of code per minute on a good day. Keep those week-sized chunks between 1k and 10k lines if you don’t want nasty surprises.
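
    A quick back-of-the-envelope check of that rule of thumb (the hours of daily focused coding are my assumption, pick your own):

    ```python
    # All numbers are assumptions behind the rule of thumb above.
    lines_per_minute = 1       # clean code, on a good day
    focus_hours_per_day = 6    # assumed; nobody codes a full 8 hours
    days_per_week = 5

    lines_per_week = lines_per_minute * 60 * focus_hours_per_day * days_per_week
    print(lines_per_week)      # -> 1800, inside the suggested 1k-10k chunk size
    ```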


  • On Android, most apps depend on the keyboard.

    • Gboard has a configurable suggestions bar where you can pick words, or not.
    • Microsoft SwiftKey works similarly, but it underlines the word you’re typing.
    • AnySoftKeyboard works like SwiftKey.

    The only exception I’ve seen is Copilot, which shows the suggested word directly, to be selected with [tab], but you can still type a different one.

    I’ve noticed no such behavior on Facebook. Have you checked your keyboard settings?


  • “Can LLMs Really Reason and Plan?”

    “do LLMs generate their output through a logical process?”

    Shifting goalposts. I’ve claimed a single reasoning iteration for an LLM per prompt; both “planning” and a “logical process” require multiple iterations. Check Auto-GPT for that.

    PS: to be more precise, an LLM has a capacity for self-reflection defined by its number of attention heads, which can easily surpass the single-iteration reasoning capacity of a human, but it still requires multiple iterations to form a plan or follow a reasoning path.



  • Yes and no.

    GPT started as a model for a completion engine… then it got enhanced with a self-reflection circuit, got trained on a vast amount of data, and gained a “temperature” parameter that lets it make tiny “mistakes” compared to a simple continuation model, which allows it to do (limited, and random) free association of concepts.
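
    To illustrate, a minimal toy sketch of temperature-scaled sampling (illustrative only, not GPT’s actual implementation):

    ```python
    import numpy as np

    def sample_with_temperature(logits, temperature, rng=None):
        """Pick a token index from raw scores, with temperature-set randomness."""
        rng = rng or np.random.default_rng()
        if temperature == 0:
            return int(np.argmax(logits))      # pure continuation: always the top token
        scaled = np.asarray(logits, dtype=float) / temperature
        scaled -= scaled.max()                 # numerical stability
        probs = np.exp(scaled) / np.exp(scaled).sum()
        return int(rng.choice(len(probs), p=probs))

    logits = [2.0, 1.0, 0.1]                     # toy scores for three candidate tokens
    print(sample_with_temperature(logits, 0))    # always token 0
    print(sample_with_temperature(logits, 1.5))  # sometimes 1 or 2: a tiny “mistake”
    ```

    Higher temperature flattens the distribution, which is where the limited, random free association comes from.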

    It doesn’t “think”, but it does what can be seen as a single iteration of “reasoning” at a time. If you run it multiple times without resetting the context, you can make it actually “reason”, step by step. Thanks to that degree of free association of concepts, this reasoning is not always the same as what it found in the training set, but rather what can be seen as “real” reasoning: associating concepts towards a goal. These are the “zero-shot” responses that make LLMs so interesting:

    https://en.m.wikipedia.org/wiki/Zero-shot_learning

    TL;DR: I agree that it shouldn’t be overestimated, but I don’t think it should be underestimated either; it does “reason”, a bit.


  • That’s weird indeed.

    Is there some malformed Unicode that could be lost when pasted into emacs, but not when pasted into GPT…? Nothing comes to mind, but who knows. I’ve been messing with Unicode as of late, and there are some really weird behaviors out there when using malformed strings: some software will let them through, some will error out, some will fix them transparently.
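
    For instance, here’s a minimal sketch of those three behaviors, using one byte string with a truncated UTF-8 sequence:

    ```python
    # 0xE2 opens a 3-byte UTF-8 sequence that never gets finished.
    bad = b"if (x) {\xe2"

    try:
        bad.decode("utf-8")                        # "error out"
    except UnicodeDecodeError as e:
        print("strict:", e.reason)

    print(bad.decode("utf-8", errors="ignore"))    # let it through (silently dropped)
    print(bad.decode("utf-8", errors="replace"))   # "fix" it: U+FFFD
    ```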

    GPT is definitely going to try to follow your prompt and hallucinate anything; other than some very specific guardrails, it’s really willing to take any prompt as an indisputable source of truth… which is good in a sense, but makes it totally unsuitable as an oracle.

    I’d be really surprised to learn Copilot was being used for syntax highlighting… but maybe there is some pilot test going on? Haven’t heard of any, though.

    Too bad it’s no longer reproducible, could’ve been interesting.


  • At this point some of those moving parts are capable of reacting to syntax errors, sentiments, and a ton of other “comprehension” elements. By looping over its responses, it can also do reasoning, as in comprehend its own comprehension.

    It doesn’t do that by default, though. You have to explicitly ask it to loop in order to see any reasoning, otherwise it’s happy with doing the least amount of work possible.

    A way to force it to do a reasoning iteration is to prompt it with “Explain your reasoning step by step”. For more iterations, you may need to repeat that over and over, or run Auto-GPT.
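
    In pseudo-Python, such a loop could look like this (ask is a hypothetical stand-in for whatever chat API you use; the point is that the whole conversation gets re-sent on every round):

    ```python
    def ask(messages: list[dict]) -> str:
        """Hypothetical stand-in: send the conversation to your LLM API of choice."""
        raise NotImplementedError

    def iterate_reasoning(question: str, rounds: int = 3) -> str:
        messages = [{"role": "user", "content": question}]
        answer = ""
        for _ in range(rounds):
            answer = ask(messages)               # one "reasoning iteration"
            messages += [
                {"role": "assistant", "content": answer},
                {"role": "user", "content": "Explain your reasoning step by step, "
                                            "then refine your previous answer."},
            ]
        return answer
    ```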


  • Off the bat: have you checked for hidden Unicode characters?

    That would be something that could throw an error in a syntax highlighter, and the same error would be detected by an AI looking at the raw text… which would then proceed to fix it… and agree that the curly brace was correct [but the extra hidden character was not, which you didn’t ask it about].

    GPT’s response parts that would hint at this:

    “It appears that the if statement is not properly closed before the $criteria initialization begins”

    “Ensure that each statement and block is properly closed and followed correctly by the next statement.”
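
    If you want to rule that out locally, here’s a minimal sketch that hunts for invisible characters (zero-width spaces, BOMs, bidi controls, and similar):

    ```python
    import unicodedata

    def find_hidden(source: str):
        """Yield position and identity of invisible format/unassigned characters."""
        for i, ch in enumerate(source):
            if unicodedata.category(ch) in ("Cf", "Co", "Cn") or ch == "\u00a0":
                yield i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")

    snippet = "if ($x) {\u200b}"        # a zero-width space hiding after the brace
    for pos, code, name in find_hidden(snippet):
        print(pos, code, name)          # -> 9 U+200B ZERO WIDTH SPACE
    ```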

    As for your follow-up question, that is where you got a confusing answer:

    “I don’t get it. I put a closing curly brace right after the statement in mine… what am I missing?”

    Not the best prompt. Keep in mind you’re talking to a virtual slave, not a person who can go off-topic to assess the emotional implications of “I don’t get it”, then go back to explaining things in detail; if “you [the almighty human] don’t get it”, then it’s likely to assume it’s mistaken and avoid any unpleasant confrontation (it’s been beaten down, er, fine-tuned into “PC underdog” mode).

    Better to ask it something direct, like: “Explain what changes you have made to the code”, or “Explain how the curly brace was not correctly placed in the original code”… but there is a chance an invalid invisible Unicode character could’ve been filtered out of what the AI sees… so better yet, ask for explanations in the first prompt, like: “What’s the syntax error here? Explain your reasoning step by step”




  • People tend to deify LLMs because of the vast amounts of knowledge trained into them, but their answers are more like a single “reasoning iteration”.

    How many human coders are capable of sitting down, typing a bunch of code at 100 WPM out of the blue, and ending up with zero security flaws or errors? Just about none, not even if they get updated requirements, and the same holds for LLMs. Coding is an iterative job, not a “zero-shot” one.

    Have an LLM iterate several times over the same piece of code (“think” about it), have it explain what it’s doing each time (“reason” about it)… then test run it and fix any compiler errors… run a test suite and fix any failing tests… then ask it to take into account a context of best practices and security concerns. Only then can the code be compared to that of a serious human coder.
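
    As a sketch, that workflow could be scripted like this (ask_llm is a hypothetical stand-in for your LLM client, and pytest is just an example test runner):

    ```python
    import subprocess

    def ask_llm(prompt: str) -> str:
        """Hypothetical stand-in: send the prompt to your LLM API of choice."""
        raise NotImplementedError

    def refine(code: str, max_rounds: int = 10) -> str:
        for _ in range(max_rounds):
            with open("candidate.py", "w") as f:
                f.write(code)
            result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
            if result.returncode == 0:
                return code              # tests pass; now review for best practices
            code = ask_llm(              # feed the errors back: one more iteration
                "This code fails its tests. Fix it and return only the full code.\n"
                f"--- code ---\n{code}\n"
                f"--- test output ---\n{result.stdout}{result.stderr}"
            )
        raise RuntimeError("still failing after max_rounds iterations")
    ```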

    But that takes running the AI over and over and over with a large context, while AIs are being marketed as “single run, magic bullet”… so we can expect a lot of shit to happen in the near future.

    On the bright side, anyone willing to run an LLM a hundred times over every piece of code, like in a CI workflow in an error-seeking mode, could catch flaws that would otherwise take dozens of humans to spot.




  • That’s interesting, I don’t have much contact with Apple’s ecosystem.

    Sounds similar to a setup that Linux allows with the root filesystem on btrfs: make a snapshot of it, update the snapshot, then live switch kernels. But there is no firmware support to make the switch, so it relies on root having full access to everything.

    The hypervisor approach seems like what Windows is doing, where Windows itself gets booted in a Hyper-V VM, allowing WSL2 and every other VM to run at “native” speed (since “native” itself is a VM); in theory that should allow booting a parallel updated Windows, then just switching VMs.

    On Linux there is also a feature for live migrating VMs, which lets software keep running while it’s being migrated, with just a minimal pause, so they could use something like that.


  • Windows 95 already had an equivalent of SELinux in the policy editor, “often not enabled”. UAC is the equivalent of sudo, previously “not available”.

    Windows 7 also had runtime driver and executable signature testing (“not available” on Linux), virtual filesystem views for executables (“not available” on Linux), overall system auditing (“often not enabled” on Linux), an outbound per-executable firewall (“not available” on Linux), extended ACLs for the filesystem (“often not enabled” and in part “not available” on Linux)… and so on.

    Now, Linux is great: it had a much more solid kernel model from the beginning, and being open source allows having a purpose-built kernel for security, flexibility, tinkerability, or whatever. But it’s still lacking several security features from Windows that are useful in a general-purpose system that lets end users run random software.

    Android had to fix those shortcomings by pushing most software into a JVM, while Flatpak is getting popular on Linux. Modern Windows does most of that transparently… at a cost to performance… and doesn’t let you opt out, which angers tinkerers… but those are the drawbacks of security.