2024-09-13

Notes on OpenAI's new o1 chain-of-thought models

  • OpenAI has released two new models, o1-preview and o1-mini, codenamed "strawberry," which offer improved reasoning capabilities through a chain-of-thought prompting pattern.
  • These models are reserved for tier 5 accounts ($1,000+ spent on API credits) and introduce "reasoning tokens" that are billed but not visible in the API response, sparking some dissatisfaction over the lack of transparency.
  • The new models can handle complex prompts better and have increased output token allowances, expanding the potential tasks solvable by large language models (LLMs).

Reactions

  • OpenAI's new o1 chain-of-thought models still produce hallucinations, such as non-existent libraries and functions, and often provide incorrect facts.
  • Users observe that while reasoning capabilities have improved, the models still fail to verify the factual accuracy of their outputs, necessitating user double-checking.
  • Some users liken the models to naive but intelligent interns, suggesting they can be useful with proper guidance, though they lack the ability to ask clarifying questions or admit uncertainty, impacting their reliability.

Data sleuths who spotted research misconduct cleared of defamation

  • A court has cleared the Data Colada researchers of defamation for identifying manipulated data in a Harvard Business School investigation.
  • Harvard will still face trial over its handling of the case, despite confirming misconduct by professor Francesca Gino, who is on administrative leave and may lose tenure.
  • The court ruled that evidence-backed conclusions are not defamation, fully clearing the Data Colada team due to their cautious, evidence-based approach.

Reactions

  • Data sleuths accused of defamation for identifying research misconduct have been cleared, with the case dismissed before discovery.
  • The court ruled that evidence-backed conclusions about fabricated data do not constitute defamation, supporting scientific integrity.
  • The defendants raised over $300k on GoFundMe for their legal defense, highlighting the high costs and emotional toll of defamation lawsuits in the US.

Boeing workers vote to strike

  • Tens of thousands of Boeing machinists voted overwhelmingly to strike after rejecting a contract offer, with 96% of voting members of the International Association of Machinists and Aerospace Workers District 751 backing the strike.
  • The strike, which began outside Boeing’s Washington state plants, could cost the company an estimated $1 billion per week and disrupt its recovery from financial and safety challenges.
  • Despite a proposed 25% pay increase over four years and enhanced benefits, the deal fell short of other union demands; Boeing is ready to return to negotiations, and the Biden administration is monitoring the situation.

Reactions

  • Boeing workers have voted to strike, with 96% in favor, after rejecting a proposed deal that included a significant pay increase.
  • The machinists' union is demanding better pay, improved working conditions, and for Boeing to "stop breaking the law."
  • The strike underscores broader dissatisfaction with Boeing's management, criticized for prioritizing profits over engineering quality and safety, contributing to issues like the 737 Max crashes.

FDA Authorizes First Over-the-Counter Hearing Aid Software

Reactions

  • The FDA has authorized the first over-the-counter hearing aid software, enabling AirPods to be used as hearing aids, potentially reducing stigma and increasing accessibility.
  • This authorization is anticipated to lower costs and encourage more individuals to address their hearing loss, though concerns about battery life and social perception persist.
  • Users have reported positive experiences with the accessibility features of hearing aids and AirPods, marking a significant step towards making hearing aids more affordable and accepted.

Entire staff of game publisher Annapurna Interactive has reportedly resigned

  • The entire staff of Annapurna Interactive, including former president Nathan Gary, has resigned after an unsuccessful attempt to spin off the company into an independent entity.
  • Annapurna Interactive's existing games and projects will remain under the company, with Hector Sanchez recently appointed as president of interactive and new media.
  • Annapurna plans to integrate its gaming operations with its film, TV, and theater divisions, continuing to publish games like Lorelei and the Laser Eyes and Open Roads, with upcoming titles such as Blade Runner 2033: Labyrinth.

Reactions

  • The entire staff of Annapurna Interactive, a game publisher, has resigned due to failed negotiations with their parent company, Annapurna Pictures, over financial integration.
  • The staff and executives preferred to spin off to maintain control over their creative direction, especially after the success of games like "Outer Wilds" and "Stray."
  • This mass resignation highlights the tension between creative independence and financial pressures within the gaming industry.

Does your startup need complex cloud infrastructure?

  • Pieter Levels advocates for simpler infrastructure, using single servers instead of complex cloud setups, to focus on product-market fit, as discussed on the Lex Fridman Podcast.
  • Two case studies highlight the pitfalls of overcomplicated setups: one with excessive Lambda functions and another with unnecessary microservices, both detracting from feature development.
  • Modern servers and tools like Docker Compose can provide powerful, manageable, and budget-friendly solutions, enabling small teams to focus on building great products rather than managing complex infrastructure.

Reactions

  • Startups often adopt complex cloud infrastructure like Kubernetes for scalability, but this can lead to poor quality and high costs due to immature team decisions.
  • Some experienced professionals argue that simpler, more reproducible setups using tools like Puppet and LTS (Long-Term Support) systems can be more efficient and cost-effective.
  • The debate highlights the trade-offs between modern cloud-native approaches and traditional, deterministic methods for managing infrastructure.

Porting SBCL to the Nintendo Switch

  • Charles Zhang and Shinmera have been working for two years to port the Trial game engine to the Nintendo Switch, focusing on adapting the Common Lisp runtime.
  • Despite successfully compiling and executing Lisp code on the Switch, unresolved issues include garbage collection and audio output, with the project costing around $17,000.
  • The Switch's ARM64 Cortex-A57 chip and OpenGL support made the port feasible, but challenges remain, such as interfacing with the Switch's proprietary OS and optimizing CLOS compilation.

Reactions

  • SBCL (Steel Bank Common Lisp) is being ported to the Nintendo Switch, which is significant for game development in Common Lisp due to its interactive code evaluation and rapid development cycles.
  • The project is led by Shinmera, who is handling the portability and build architecture, highlighting the technical challenges and potential benefits of running SBCL on specialized game hardware.
  • The use of the official Nintendo SDK (Software Development Kit) is necessary for publishing games on the Switch, as homebrew SDKs are not supported for retail console releases.

Who Owns Nebula?

  • Nebula is a video-on-demand streaming service focusing on educational content, built by content creators but not truly owned by them.
  • Standard Broadcast owns 83.125% of Nebula, CuriosityStream owns 16.875%, and creators directly own 0%, though they receive 50% of profits and proceeds from a sale.
  • Creators have "shadow equity," meaning they are compensated like owners without holding actual stock, raising questions about the platform's alignment with creators' values.

Reactions

  • Nebula is owned by Standard Broadcast LLC, with 44 creators having shadow equity instead of direct ownership to avoid logistical and tax issues.
  • If Nebula is sold, creators receive 50% of the proceeds, but some argue the structure lacks transparency and true cooperative ownership.
  • Critics claim the marketing is misleading since creators don't have direct equity or control over Nebula.

FlowTracker – Track data flowing through Java programs

  • FlowTracker is a Java agent designed to track data flow within Java programs, aiding in understanding the origin and significance of outputs.
  • It offers a video tutorial and a live demo for users to explore its functionalities.
  • More information and access to the tool can be found on its GitHub page: https://github.com/coekie/flowtracker.

Reactions

  • FlowTracker is a Java agent designed to track data flow in Java programs, aiding in understanding program outputs.
  • Users compare FlowTracker to tools like jitwatch and dynamic taint tracking, highlighting its potential for troubleshooting and data origin tracking.
  • The demo showcases its ability to trace an HTML element back to the SQL statement that added it to the database, generating excitement for its integration into various development environments.

Better-performing “25519” elliptic-curve cryptography

  • AWS has enhanced the performance and correctness of "25519" elliptic-curve cryptography in its open-source library, AWS LibCrypto (AWS-LC), through automated reasoning and CPU-specific optimizations.
  • These improvements, based on Google's BoringSSL, include significant performance gains for x25519 and Ed25519 algorithms on x86_64 and Arm64 CPUs, with Ed25519 signing operations seeing a 108% increase and x25519 operations improving by 113%.
  • The enhancements ensure constant-time execution to prevent side-channel attacks, with correctness verified by the s2n-bignum library and HOL Light theorem prover, making AWS-LC a robust choice for secure cryptographic implementations.
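A percentage increase above 100% is easy to misread, so here is a minimal sketch of the arithmetic behind the figures quoted above. It assumes the percentages describe throughput (operations per second), which the summary does not state explicitly; the function names are illustrative and not part of AWS-LC.

```python
def speedup_factor(percent_increase: float) -> float:
    """Convert a percentage increase in throughput into a multiplier.

    A 100% increase means twice the throughput, so the factor is
    1 + percent / 100.
    """
    return 1.0 + percent_increase / 100.0


def time_saved_per_operation(percent_increase: float) -> float:
    """Fraction of per-operation time saved for a given throughput increase.

    If throughput rises by a factor f, each operation takes 1/f of the
    original time, so the saving is 1 - 1/f.
    """
    return 1.0 - 1.0 / speedup_factor(percent_increase)


if __name__ == "__main__":
    # Percentages taken from the summary; the throughput reading is an assumption.
    for name, pct in [("Ed25519 signing", 108.0), ("x25519", 113.0)]:
        factor = speedup_factor(pct)
        saved = time_saved_per_operation(pct)
        print(f"{name}: {factor:.2f}x throughput, "
              f"about {saved:.0%} less time per operation")
```

Under that reading, both operations run at a little more than twice their previous rate, which corresponds to roughly half the time per operation.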

Reactions

  • Commenters note the significant performance improvements in Amazon's "25519" elliptic-curve cryptography, and also point to an AVX-512 optimized implementation by the Firedancer team that outperforms OpenSSL.
  • The x25519 algorithm is used in TLS 1.3 and SSH hybrid schemes for post-quantum key agreement, highlighting its importance in modern cryptographic protocols.
  • Firedancer's codebase, known for blockchain optimization, is praised for its performance and secure programming practices, contributing to the broader adoption of ed25519 over RSA for SSH keys due to better performance, security, and compatibility.

Zero-Click Calendar invite – Critical zero-click vulnerability chain in macOS

  • A zero-click vulnerability in macOS Calendar allowed attackers to add or delete files within the Calendar sandbox, potentially leading to malicious code execution and compromising iCloud Photos data.
  • Apple fixed these vulnerabilities between October 2022 and September 2023, addressing issues like arbitrary file write/delete, remote code execution, and access to sensitive photos data.
  • The exploit chain involved multiple steps to bypass macOS security, including sandbox evasion, Gatekeeper bypass, and TCC protection circumvention, with fixes implemented in various macOS updates.

Reactions

  • A critical zero-click vulnerability in macOS allowed attackers to send malicious calendar invites with file attachments, potentially stealing iCloud Photos data without user interaction.
  • Users are questioning the security of such invites and suggesting whitelisting specific senders as a precaution.
  • Apple has been slow to pay bounties for these vulnerabilities, raising concerns about their commitment to user privacy and timely updates.

Notepat – Aesthetic Computer

Reactions

  • "Notepat" is a digital art project by Jeffrey Scudder, accessible via the website aesthetic.computer, featuring a retro computing environment and unique tools for creating digital art.
  • The project includes interactive elements like a 'notepat' app for music creation, with commands and a distinctive keyboard layout based on the chromatic scale.
  • Users can explore various features, including VR experiences like "Freaky Flowers," and the project has generated significant interest for its innovative and artistic approach to digital tools.

Meta fed its AI on everything adults have publicly posted since 2007

  • Meta has been using public posts and photos from Facebook and Instagram since 2007 to train its AI models, unless users set their posts to private.
  • European users can opt out of this data usage due to local privacy laws, but users in other regions, including Australia, do not have this option.
  • Meta has not provided clear details about the specifics of its data usage and collection timeline, raising privacy concerns among users.

Reactions

  • Meta has been using public posts from adults since 2007 to train its AI, sparking a debate on the ethics and legality of using public data for AI training.
  • Critics worry about creators' work being copied without consent, raising questions about fair use and copyright laws.
  • The discussion underscores the tension between technological progress and the protection of individual rights.

Greenland landslide caused freak wave that shook Earth for nine days

  • In August 2023, a landslide in Greenland's Dickson Fjord caused a 110-meter-high tsunami, creating a standing wave that lasted for nine days.
  • Seismologists initially labeled the resulting signal an "unidentified seismic object" (USO) with a frequency of 11 millihertz; the landslide itself was triggered by climate change-induced glacier thinning.
  • The fjord's unique shape and features trapped the wave's energy, highlighting the significant impact of climate change on Earth's geological phenomena.
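To put the 11 millihertz figure in more familiar terms, the following sketch converts it into a period and estimates how many oscillations fit into nine days. It uses only the numbers from the summary above and treats the frequency as constant for the whole duration, which is a simplifying assumption; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def period_seconds(frequency_hz: float) -> float:
    """Period of one oscillation: the reciprocal of the frequency."""
    return 1.0 / frequency_hz


def cycle_count(frequency_hz: float, duration_days: float) -> float:
    """Number of oscillations in the given duration at a constant frequency."""
    return frequency_hz * duration_days * SECONDS_PER_DAY


if __name__ == "__main__":
    frequency_hz = 11e-3   # 11 millihertz, from the summary
    duration_days = 9      # duration of the signal, from the summary

    print(f"Period: {period_seconds(frequency_hz):.1f} s per oscillation")
    print(f"Oscillations over {duration_days} days: "
          f"{cycle_count(frequency_hz, duration_days):.0f}")
```

An 11 mHz signal completes one oscillation roughly every 91 seconds, so under the constant-frequency assumption the wave sloshed back and forth several thousand times over the nine days.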

Reactions

  • A Greenland landslide triggered a 110-meter-high tsunami, initially noticed when a former employee saw an abandoned SIRIUS research station swept away after a cruise ship ran aground.
  • The tsunami, initially 7 meters high, was detected within a week due to the cruise incident, although seismic data would have eventually revealed it.
  • Seismological devices worldwide recorded the event, which lasted nine days, highlighting how random occurrences can lead to significant discoveries.

Wallops: A modern IRC client for classic Mac OS

  • Wallops, a modern IRC client for classic Mac OS, has released version 2.0, compatible with System 6 and newer versions, and includes significant updates and bug fixes.
  • Key features include a tabbed interface for multiple connections, channels, and private messages, window resizing, and optimized nick list sorting for large channels.
  • Wallops 2.0 also introduces new commands, improved interface elements, and performance enhancements, making it a robust tool for IRC users on classic Mac systems.

Reactions

  • Wallops is a modern IRC (Internet Relay Chat) client designed for classic Mac OS, generating interest among enthusiasts of vintage computing.
  • The release has sparked excitement due to the rarity of new software for old systems, with users reminiscing about their experiences with classic Macs.
  • Some users have noted improvements in Mac emulation, suggesting tools like MAME (Multiple Arcade Machine Emulator) for those without functioning vintage hardware.