2024-10-11

$2 H100s: How the GPU Rental Bubble Burst

The GPU market has experienced a significant price drop, with H100 GPUs decreasing from $8/hr to under $2/hr due to oversupply and changing demand dynamics.
Factors contributing to this shift include reserved compute resales, open model fine-tuning, and a reduction in new foundation model companies, making renting GPUs more favorable than purchasing.
The emergence of open-weight models and more affordable alternatives, such as AMD and Intel GPUs, is influencing the market, with a growing emphasis on AI inference and fine-tuning, supported by platforms like Featherless.AI offering cost-effective AI solutions.

Reactions

The GPU rental market has experienced a dramatic price drop for H100 GPUs, from $8/hr to $2/hr, due to an oversupply and decreased demand from new foundation model companies.
This price reduction has burst the GPU rental bubble, affecting investors who heavily invested in GPU infrastructure.
The article explores the potential for a more accessible AI landscape with cheaper compute options, though the long-term sustainability of these low prices and the future of AI infrastructure are uncertain.

Tesla Robotaxi

Reactions

Tesla recently showcased its Robotaxi, emphasizing a vision for autonomous taxis that contrasts with Waymo's approach, which uses costly hardware like LiDAR.
The Robotaxi's design, which lacks a steering wheel, indicates a future dependent on full autonomy, though it faces regulatory and technological challenges.
Tesla's Full Self-Driving (FSD) technology is a topic of debate, with critics questioning its readiness for unsupervised driving and supporters optimistic about its potential.

Begin disabling installed extensions still using Manifest V2 in Chrome stable

Google is phasing out Manifest V2 for Chrome extensions, with warnings and disabling of these extensions starting on pre-stable channels as of October 9, 2024.
Users are encouraged to transition to Manifest V3 alternatives, with enterprises having until June 2025 to complete the transition using the ExtensionManifestV2Availability policy.
The phase-out process began on June 3, 2024, and the Chrome Web Store has not accepted new Manifest V2 extensions since June 2022 for private and January 2022 for public or unlisted extensions.
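
Whether an extension is affected comes down to the `manifest_version` field in its `manifest.json`. A minimal sketch of that check, assuming the manifest text has already been read from disk; the helper name `uses_manifest_v2` and both sample manifests are hypothetical:

```python
import json


def uses_manifest_v2(manifest_text: str) -> bool:
    """Return True if the manifest declares manifest_version 2."""
    manifest = json.loads(manifest_text)
    return manifest.get("manifest_version") == 2


# Hypothetical sample manifests, for illustration only.
legacy = '{"name": "Example Blocker", "version": "1.0", "manifest_version": 2}'
modern = '{"name": "Example Blocker", "version": "2.0", "manifest_version": 3}'

print(uses_manifest_v2(legacy))  # True
print(uses_manifest_v2(modern))  # False
```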

Reactions

Chrome is transitioning from Manifest V2 to Manifest V3 extensions, affecting ad blockers such as uBlock Origin by limiting their capabilities.
While Chrome is making this shift, browsers like Firefox, Vivaldi, and Brave intend to continue supporting Manifest V2 for the time being.
This change has prompted discussions on user control and privacy, with some users contemplating switching to alternative browsers to retain effective ad-blocking features.

A Lisp compiler to RISC-V written in Lisp

uLisp is a version of the Lisp programming language designed for microcontrollers, supporting platforms like Arduino, Raspberry Pi, and ESP32.
It includes features such as debugging, SD card interface, and I2C/SPI serial interfaces, with examples for applications like LED blinking and data logging.
A significant feature is the Lisp compiler for RISC-V, which compiles Lisp functions into machine code, supporting recursive functions and tail-call optimization for improved performance.
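
To give a feel for what compiling Lisp to RISC-V involves, here is a rough sketch in Python of a stack-based code generator for nested arithmetic expressions. It is not the uLisp compiler's method: the tuple representation of expressions, the function names, and the register choices are all assumptions made for illustration, and it emits assembly text rather than machine code.

```python
# Map Lisp arithmetic operators to RISC-V instructions
# (mul requires the M extension).
OPS = {"+": "add", "-": "sub", "*": "mul"}


def compile_expr(expr, out):
    """Emit assembly that leaves the value of expr in register a0."""
    if isinstance(expr, int):
        out.append(f"li a0, {expr}")
        return
    op, left, right = expr
    compile_expr(left, out)
    # Save the left operand on the stack while the right one is computed.
    out.append("addi sp, sp, -4")
    out.append("sw a0, 0(sp)")
    compile_expr(right, out)
    # Restore the left operand into a1, then combine: a0 = a1 <op> a0.
    out.append("lw a1, 0(sp)")
    out.append("addi sp, sp, 4")
    out.append(f"{OPS[op]} a0, a1, a0")


def compile_function(name, body):
    """Wrap a compiled expression in a label and a return."""
    out = [f"{name}:"]
    compile_expr(body, out)
    out.append("ret")
    return out


# (+ 1 (* 2 3)) written as nested tuples instead of parsed Lisp text.
asm = compile_function("f", ("+", 1, ("*", 2, 3)))
print("\n".join(asm))
```

A real compiler would also handle function calls, keep the stack pointer aligned as the calling convention requires, and allocate registers instead of spilling every operand.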

Reactions

A Lisp compiler for RISC-V, written in Lisp, is under development but lacks certain operations and functions to be self-compiling.
The compiler supports basic Lisp functions like car and cdr, but is not yet complete.
uLisp is highlighted for its simplicity and suitability for microcontrollers, with RISC-V being an attractive platform for tech enthusiasts and hackers.

Nobel Peace Prize for 2024 awarded to Nihon Hidankyo

The 2024 Nobel Peace Prize has been awarded to Nihon Hidankyo, a Japanese organization of atomic bomb survivors, known as Hibakusha, for their advocacy for a nuclear-free world.
The Hibakusha have significantly contributed to establishing the "nuclear taboo," a global norm against the use of nuclear weapons, through their impactful testimonies.
This recognition aligns with Alfred Nobel's vision of honoring efforts that benefit humanity and continues to inspire new generations towards nuclear disarmament.

Reactions

The 2024 Nobel Peace Prize was awarded to Nihon Hidankyo, a Japanese organization advocating against nuclear weapons, underscoring the persistent threat of nuclear arms amid global tensions.
This award serves as a reminder of the devastating impact of nuclear warfare, as exemplified by Hiroshima and Nagasaki, and stresses the importance of disarmament.
The prize discussion also involves the complexities of nuclear deterrence, international law, and the geopolitical dynamics among nuclear powers.

WordPress Alternatives

The article has been updated to include more Content Management System (CMS) alternatives due to increased interest, particularly in light of the current WordPress situation.
The list features downloadable CMS options like Ghost, Kirby, Indiekit, Craft CMS, ClassicPress, Statamic, Wagtail, and Textpattern, excluding API and git-based CMSs.
Notable mentions include Ghost for its built-in email features, Kirby for its file-based approach, and ClassicPress as a community-led WordPress fork, while some CMSs like Anchor are no longer maintained.

Reactions

Jekyll on GitHub Pages is recommended for simple blogs using Markdown, offering ease of use without requiring local setup and allowing content portability across platforms.
Alternatives to WordPress for blogging include Chyrp Lite, Typecho, Quartz, and Logseq, while Drupal, ProcessWire, and Wagtail provide more flexibility for developers.
Static site generators such as Astro and Publii are becoming increasingly popular, and for image hosting, options like S3+Cloudfront or CloudFlare are suggested.

Google Play killed my game and won't tell me why

Tukkun, an indie game developer, has been working on "Anti-Idle: Reborn," which was approved by Google and Apple, and has been in Closed Beta for a month.
On October 7, 2024, Google terminated Tukkun's account citing "prior violations" and "High Risk Behavior," but did not provide a clear explanation, impacting his work and income.
This situation highlights a broader issue where developers experience vague account terminations, prompting calls for more transparency and clarity from platforms like Google.

Reactions

Google Play removed a developer's game without explanation, underscoring the significant control tech companies have over developers.
Similar incidents have been reported with Amazon and Google, where accounts or apps are banned without clear reasons or adequate support.
Developers are encouraged to diversify their platforms to mitigate risks, as this situation highlights broader concerns about tech giants' customer service and the dependency risks of building businesses on their platforms.

Nurdle Patrol

Reactions

In 2023, 221 shipping containers were lost at sea, a minor number compared to the 250 million shipped annually, highlighting the scale of global shipping operations.
Plastic pellets, known as nurdles, are visible pollutants on beaches and can degrade into microplastics, entering the food chain and posing potential harm, though they are not the primary source of marine plastic pollution.
The discussion on plastic pollution emphasizes its complexity and global impact, including the export of waste from developed to developing countries and the possibility that ecosystems adapt to plastic pollution, and it raises concerns about future plastic use.

Initial CUDA Performance Lessons

Malte Skarupke discusses his experience learning CUDA, noting that it is essentially C++ with additional features for parallel computing.
Key lessons for optimizing CUDA performance include memory coalescing, understanding various memory types, and maximizing parallelism by using many threads and separating tasks into different kernels.
Skarupke emphasizes that writing CUDA is akin to solving a puzzle, where the primary focus should be on running tasks in parallel before optimizing for speed.
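
Memory coalescing can be illustrated without a GPU by counting how many memory segments one warp touches for a given access pattern. The sketch below is a simplified model in Python, not code from the article: the 128-byte segment size, the aligned array start, and the function name `segments_touched` are assumptions, and real hardware transaction sizes vary by architecture.

```python
WARP_SIZE = 32       # threads that issue a memory access together
SEGMENT_BYTES = 128  # assumed size of one memory transaction
ELEMENT_BYTES = 4    # one 32-bit float


def segments_touched(stride: int) -> int:
    """Count distinct segments read by one warp.

    Thread t reads element t * stride of a float array that is assumed
    to start on a segment boundary.
    """
    addresses = [t * stride * ELEMENT_BYTES for t in range(WARP_SIZE)]
    return len({addr // SEGMENT_BYTES for addr in addresses})


print(segments_touched(1))   # 1: neighbouring threads read neighbouring floats
print(segments_touched(2))   # 2
print(segments_touched(32))  # 32: every thread lands in its own segment
```

Under this model, a stride-1 pattern lets the whole warp be served by one transaction, while a stride of 32 needs 32 transactions for the same amount of useful data.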

Reactions

The discussion focuses on optimizing CUDA code for GPU performance, specifically for an LHC (Large Hadron Collider) experiment trigger, by managing registers, shared memory, and thread blocks.
It emphasizes the trade-offs between occupancy (the number of active threads), register usage, and memory latencies, highlighting the evolution of programming constraints in CUDA.
The conversation compares GPU and CPU performance, noting differences in power consumption and computational capabilities, and stresses the importance of balancing occupancy and performance for future hardware and software advancements.
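
The trade-off between register usage and occupancy can be sketched with a little arithmetic. The Python function below considers only the register file and the per-multiprocessor thread limit; the default limits (65,536 registers and 2,048 resident threads per multiprocessor) are illustrative assumptions rather than figures from the discussion, and it ignores shared memory, block-count limits, and register allocation granularity.

```python
def occupancy(regs_per_thread, threads_per_block,
              regs_per_sm=65536, max_threads_per_sm=2048):
    """Fraction of a multiprocessor's thread slots that can be resident,
    limited only by registers and the thread cap (assumed limits)."""
    regs_per_block = regs_per_thread * threads_per_block
    blocks_by_regs = regs_per_sm // regs_per_block
    blocks_by_threads = max_threads_per_sm // threads_per_block
    resident_blocks = min(blocks_by_regs, blocks_by_threads)
    return resident_blocks * threads_per_block / max_threads_per_sm


print(occupancy(32, 256))   # 1.0
print(occupancy(64, 256))   # 0.5
print(occupancy(128, 256))  # 0.25
```

Doubling the registers each thread uses halves the number of threads that fit, which is why a kernel may trade some occupancy for fewer memory round trips, or the reverse.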

The FBI created a coin to investigate crypto pump-and-dump schemes

The FBI developed an Ethereum-based cryptocurrency, NexFundAI, to investigate and expose crypto pump-and-dump schemes, leading to significant legal actions.
Charges were filed against 18 individuals and entities for fraud and market manipulation, with the Securities and Exchange Commission targeting three market makers and nine others for inflating crypto asset prices.
The Department of Justice successfully recovered $25 million in fraudulent proceeds, which will be returned to investors, highlighting the operation's effectiveness in combating crypto fraud.

Reactions

The FBI developed a cryptocurrency to probe pump-and-dump schemes, which are fraudulent practices that artificially inflate the price of an asset before selling it off.
This initiative has ignited debates on entrapment and the ethical implications of law enforcement creating counterfeit securities.
The discussion extends to broader concerns about the legitimacy of cryptocurrencies and the government's role in regulating digital currencies.

NotesHub: cross-platform, Markdown-based note-taking app

The app is available across multiple platforms, including iOS, Android, Windows, Mac, Apple Vision Pro, and the Web, with the Web version being a free Progressive Web App that works offline.
Notes can be stored in Git repositories, with the best integration with GitHub, and also support self-hosted options like Gitea, file systems, or iCloud Drive.
The app supports rich Markdown syntax with extensions for creating Kanban boards, Excalidraw-based whiteboards, and includes features like Mermaid and ABC music notation.

Reactions

NotesHub is a versatile, Markdown-based note-taking app available across multiple platforms, including iOS, Android, Windows, Mac, Apple Vision Pro, and the Web.
The app offers a free Progressive Web App version, while native versions require a one-time payment, with strong integration for storing notes in Git repositories like GitHub, GitLab, or Bitbucket.
It features rich Markdown syntax, Kanban boards, and Excalidraw-based whiteboards, with users praising its clean design and offline capabilities, though it is not open-sourced and has limited Linux support.

Dead man's switch without reliance on your infra

A new Go project, Deadcheck, has been developed to function as a dead man's switch without relying on cron jobs, timers, or databases.
Deadcheck integrates with PagerDuty, a popular incident management platform, to keep incidents snoozed until a check-in is missed, at which point it triggers an alert.
This project is notable for its innovative approach to managing alerts and incidents without traditional scheduling or database dependencies.
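
The snooze-until-missed idea can be sketched in a few lines. This Python model is not Deadcheck's code and makes no PagerDuty calls: the class `DeadMansSwitch` is hypothetical, and a local object stands in for the incident state that, per the summary, Deadcheck keeps in PagerDuty. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class DeadMansSwitch:
    """An alert that stays snoozed until a check-in is missed."""

    def __init__(self, interval_seconds: float, now: float):
        self.interval = interval_seconds
        self.snoozed_until = now + interval_seconds

    def check_in(self, now: float) -> None:
        # Each check-in pushes the snooze deadline further out.
        self.snoozed_until = now + self.interval

    def is_firing(self, now: float) -> bool:
        # Once the deadline passes with no check-in, the alert is live.
        return now >= self.snoozed_until


switch = DeadMansSwitch(interval_seconds=60, now=0)
print(switch.is_firing(now=30))   # False: still inside the first interval
switch.check_in(now=50)           # deadline moves to t=110
print(switch.is_firing(now=100))  # False
print(switch.is_firing(now=110))  # True: no check-in arrived in time
```

No timer runs on the monitored side in this design; the deadline itself is the only state, which is what lets the real project hand that state to an external service.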

Reactions

Deadcheck is a Go project designed as a dead man's switch, eliminating the need for cron jobs or databases, and integrates with PagerDuty to manage alerts.
The project has sparked discussions on dead man's switches, including legal aspects and alternative solutions such as using attorneys or blockchain systems.
Users have suggested existing services like Cronitor or OpsGenie for similar functionalities, and the project plans to expand integrations beyond PagerDuty.

Understanding the Limitations of Mathematical Reasoning in Large Language Models

The paper "GSM-Symbolic" by Iman Mirzadeh et al. investigates the mathematical reasoning capabilities of Large Language Models (LLMs) using the GSM8K benchmark.
The authors introduce GSM-Symbolic, a new benchmark with symbolic templates, showing that LLMs struggle with variations in numerical values and additional clauses in questions.
The study suggests that LLMs may replicate reasoning from training data rather than performing genuine logical reasoning, highlighting their limitations in mathematical reasoning.
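
A symbolic template can be thought of as a question with placeholders plus a rule for computing the answer, so that many variants share the same reasoning but differ in names and numbers. The sketch below shows the idea in Python; the template text, the value ranges, and the function `make_variant` are invented for illustration and are not taken from GSM8K or the paper.

```python
import random

# Hypothetical template; the placeholders are filled with fresh values.
TEMPLATE = ("{name} picks {a} apples on Monday and {b} apples on Tuesday. "
            "{name} gives away {c} apples. How many apples are left?")


def make_variant(rng: random.Random):
    """Instantiate the template and compute the matching answer."""
    name = rng.choice(["Ava", "Liam", "Noor"])
    a = rng.randint(5, 50)
    b = rng.randint(5, 50)
    c = rng.randint(1, a + b)  # never give away more than was picked
    question = TEMPLATE.format(name=name, a=a, b=b, c=c)
    return question, a + b - c


rng = random.Random(0)
for _ in range(3):
    question, answer = make_variant(rng)
    print(question, "->", answer)
```

A model that reasons through the problem should score the same on every variant; a drop in accuracy when only the numbers change is the kind of fragility the summary describes.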

Reactions

Large Language Models (LLMs) face challenges in mathematical reasoning, particularly when problems include irrelevant information, which affects their performance.
This limitation underscores LLMs' reliance on pattern recognition over logical reasoning, making them less effective in real-world scenarios with extraneous details.
Despite advancements, LLMs still struggle to distinguish important information from noise, a critical skill needed for practical applications.

ARIA: An Open Multimodal Native Mixture-of-Experts Model

Aria is an open multimodal native AI model that integrates diverse real-world information for a comprehensive understanding, surpassing models like Pixtral-12B and Llama3.2-11B in performance.
It is a mixture-of-experts model with 3.9 billion and 3.5 billion activated parameters per visual and text token, respectively, enhancing its language and multimodal capabilities.
The model's weights and codebase are open-sourced, facilitating easy adoption and adaptation by developers and researchers.
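
The gap between total and activated parameters comes from routing: each token is sent to only a few experts, so most of the model's weights sit idle for any given token. The Python sketch below shows one common variant, top-k routing with a softmax over the selected experts. It is a generic illustration, not Aria's router: the function names, the toy experts, and the hand-written router scores are all assumptions, and in a real model the scores are computed from the token itself.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and renormalise their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; the rest never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy experts standing in for feed-forward blocks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores for one token

print(top_k_route(logits, 2))          # [(1, 0.5), (3, 0.5)]
print(moe_layer(3.0, experts, logits))  # 7.5
```

Only experts 1 and 3 run for this token, so the compute cost tracks the two selected experts even though all four must be held in memory.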

Reactions

ARIA is a new multimodal native Mixture-of-Experts (MoE) model that surpasses Pixtral-12B and Llama3.2-11B in performance and inference speed by efficiently utilizing active parameters.
Despite having memory usage similar to a 25B model, ARIA performs like a 10B model and operates as quickly as a 4B model, making it suitable for devices with adequate memory, such as an M2 Max.
The model's experts focus on syntax, with room for improvement in expert selection, and it is currently available for testing, although some users have encountered platform issues.