How to create a patch for a repository with submodules
Check out whatever repository you want.
Then make all the changes required to make it work, in any submodule as needed. Don't commit in the submodules.
Make sure you have committed everything in your own repository, so that git diff shows changes only in your submodules and your "parent" repository is still clean.
Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.
1. Tell me what Donald Knuth says to Stephen Wolfram about chatGPT.
I'm sorry for any confusion, but as an AI, I can't provide real-time updates or specific conversations between individuals beyond my last training cut-off in September 2021. I don't have access to the internet or any databases for real-time information or private communications. Therefore, I can't give you the details of a conversation between Donald Knuth and Stephen Wolfram about ChatGPT, if such a conversation has taken place after my training data.
However, it's worth noting that both Donald Knuth and Stephen Wolfram are significant figures in the fields of computer science and computational mathematics. Knuth is widely recognized for his multi-volume work "The Art of Computer Programming" and for creating the TeX computer typesetting system, while Wolfram is known for his work in developing Mathematica and Wolfram|Alpha, and for his contributions to cellular automata theory, outl
Training open-source LLMs on ChatGPT output is a really bad idea.
Everyone is now racing to create open-source alternatives to compete with GPT3.5/GPT4. A common shortcut used by some teams to bootstrap their effort is to fine-tune their model on ChatGPT output. I used to think it was a good idea and totally fair play to do this. Actually, I still think it’s fair play. OpenAI effectively distilled the entire web into its models. They say themselves that they use (mostly) publicly accessible information. So distilling their model is, in effect, distilling the public open web, and small Terms of Service details aside, I don’t see major ethical problems with that. Right? Well, that’s not entirely true, and I now realize that, even ignoring the ethical considerations, using their output is a really bad idea.
First of all, from a purely technical point of view, as @yoavgo explains beautifully in his recent post, there is no way to align LLMs correctly without the RLHF component. I encourag
jnp.device_put(1) is deceptively simple to write in JAX. But on a TPU, what actually happens? How does a tensor containing the value 1 actually get onto a TPU?
I faced bandwidth issues between a WG peer and a WG server. Download bandwidth (from WG server to WG peer) was reduced significantly, and upload bandwidth was practically nonexistent.
I found a few Reddit posts saying that you need to choose the right MTU, so I wrote a script to find an optimal MTU.
Ideally I would have liked to run all possible MTU combinations for both the WG server and the WG peer, but for simplicity I chose to fix the WG server at the original MTU of 1420 and tried all MTUs from 1280 to 1500 on the WG peer.
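The sweep described above can be sketched roughly as follows. This is a minimal illustration and not the original script: the interface name `wg0`, the server address `10.0.0.1`, the helper names, and the choice to rank MTUs by the slower of the two directions are all assumptions. It relies on `ip link set ... mtu` to change the MTU and on `iperf3 -J` (JSON output) with `-R` (reverse mode) to measure download.

```python
import json
import subprocess
from typing import Callable, Dict, Iterable, Tuple

WG_INTERFACE = "wg0"       # assumption: WireGuard interface name on the peer
IPERF_SERVER = "10.0.0.1"  # assumption: tunnel address of the WG server running `iperf3 -s`

Result = Tuple[float, float]  # (upload_bps, download_bps)


def set_mtu(mtu: int, interface: str = WG_INTERFACE) -> None:
    """Set the MTU of the interface (requires root)."""
    subprocess.run(
        ["ip", "link", "set", "dev", interface, "mtu", str(mtu)], check=True
    )


def parse_iperf3_bps(raw_json: str) -> float:
    """Extract the received throughput in bits/s from `iperf3 -J` output."""
    return float(json.loads(raw_json)["end"]["sum_received"]["bits_per_second"])


def run_iperf3(reverse: bool) -> float:
    """Run one iperf3 test; reverse=True makes the server send (download)."""
    cmd = ["iperf3", "-c", IPERF_SERVER, "-J"] + (["-R"] if reverse else [])
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_iperf3_bps(done.stdout)


def measure_with_iperf3(mtu: int) -> Result:
    """Apply one MTU on the peer, then measure upload and download."""
    set_mtu(mtu)
    return run_iperf3(reverse=False), run_iperf3(reverse=True)


def sweep(mtus: Iterable[int], measure: Callable[[int], Result]) -> Dict[int, Result]:
    """Measure every candidate MTU and collect the results."""
    return {mtu: measure(mtu) for mtu in mtus}


def best_mtu(results: Dict[int, Result]) -> int:
    """Pick the MTU whose slower direction is fastest (an assumed criterion)."""
    return max(results, key=lambda mtu: min(results[mtu]))
```

On the peer, something like `best_mtu(sweep(range(1280, 1501), measure_with_iperf3))` would run the full sweep with the server left at MTU 1420. Passing `measure` in as a parameter keeps the sweep logic testable without a live tunnel.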
Testing
On the WG server, I started an iperf3 server.
On the WG peer, I wrote a script that does the following: