A Map for Studying Pre-training in LLMs
- Data Collection
- General Text Data
- Specialized Data
- Data Preprocessing
- Quality Filtering
- Deduplication
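
As a rough illustration of the two preprocessing steps named above, here is a minimal Python sketch. It is not taken from the map itself: the function names, the `min_words` and `min_alpha_ratio` thresholds, and the choice of exact hash-based deduplication (rather than a near-duplicate method) are all assumptions made for the example.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash alike."""
    return re.sub(r"\s+", " ", text).strip().lower()


def passes_quality_filter(text: str, min_words: int = 5,
                          min_alpha_ratio: float = 0.6) -> bool:
    """Hypothetical heuristic: enough words, and mostly alphabetic characters."""
    words = text.split()
    if len(words) < min_words:
        return False
    non_space = [c for c in text if not c.isspace()]
    alpha = sum(c.isalpha() for c in non_space)
    return alpha / len(non_space) >= min_alpha_ratio


def deduplicate(docs):
    """Exact deduplication on normalized text, keeping the first occurrence."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def preprocess(docs):
    """Quality filtering first, then deduplication, in the order of the outline."""
    return deduplicate([d for d in docs if passes_quality_filter(d)])
```

Calling `preprocess` on a list of strings drops short or mostly non-alphabetic entries, then removes copies that differ only in case or whitespace. Real pipelines typically use stronger filters and near-duplicate detection; this only shows the shape of the two steps.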
// Country names and the calendar URLs to fetch
const calendarInfo = {
  "Japan" : "ja.japanese#holiday@group.v.calendar.google.com",
  "Taiwan" : "en.taiwan.official#holiday@group.v.calendar.google.com",
  "US" : "en.usa.official#holiday@group.v.calendar.google.com",
};
// Set this function as the trigger
function callTrigger()
{
void main() {
  final email1 = Email.maybeFrom("frank@moreno");
  final email2 = Email.maybeFrom("frank@moreno.com");
  print(email1);
  print(email2);
}
bool isValidEmail(String email) {
  // Regular expression to validate the format of an email
#!/usr/bin/env python3
import minimalmodbus
import serial
powerMeter = minimalmodbus.Instrument('/dev/ttyUSB0', 1)
powerMeter.serial.baudrate = 9600
powerMeter.serial.bytesize = 8
powerMeter.serial.parity = serial.PARITY_NONE
powerMeter.serial.stopbits = 1
powerMeter.mode = minimalmodbus.MODE_RTU
#!/usr/bin/env python
# coding: utf-8
# You need PIL <http://www.pythonware.com/products/pil/> to run this script
# Download unifont.ttf from <http://unifoundry.com/unifont.html> (or use
# any TTF you have)
# Copyright 2011 Álvaro Justen [alvarojusten at gmail dot com]
# License: GPL <http://www.gnu.org/copyleft/gpl.html>
from image_utils import ImageText
A lot of these are outright stolen from Edward O'Campo-Gooding's list of questions. I really like his list.
I'm having some trouble paring this down to a manageable list of questions -- I realistically want to know all of these things before starting to work at a company, but it's a lot to ask all at once. My current game plan is to pick 6 before an interview and ask those.
I'd love comments and suggestions about any of these.
I've found questions like "do you have smart people? Can I learn a lot at your company?" to be basically totally useless -- everybody will say "yeah, definitely!" and it's hard to learn anything from them. So I'm trying to make all of these questions pretty concrete -- if a team doesn't have an issue tracker, they don't have an issue tracker.
I'm also mostly asking not about principles but about the way things are -- not "Do you think code review is important?", but "Does all code get reviewed?".
/* compile with:
   on linux:   gcc -g stack_traces.c
   on OS X:    gcc -g -fno-pie stack_traces.c
   on windows: gcc -g stack_traces.c -limagehlp
*/
#include <signal.h>
#include <stdio.h>
#include <assert.h>
<?php
namespace App\Services;
use Illuminate\Http\Client\PendingRequest;
use Illuminate\Support\Facades\Auth;
use Illuminate\Support\Facades\Http;
class LemonSqueezyService
{
Install VMware Workstation PRO 17 (Read it right. PRO!)
These keys might also work with VMware Fusion 13 PRO. Just tested it.
Sub to me on youtube pls - PurpleVibe32
If you want more keys - call my bot on telegram. @purector_bot (THE BOT WON'T REPLY ANYMORE) - Or: https://cdn.discordapp.com/attachments/1040615179894935645/1074016373228978277/keys.zip - the password in the zip is 102me.
---
This gist may be taken down at any time.
PLEASE, DON'T COPY THIS. IF YOU FORK IT, DON'T EDIT IT.
*If you have a problem, comment and people will try to help you!
*No virus
Exhaustive list of SPDX (Software Package Data Exchange) licenses: https://spdx.org/licenses/