Bimillennial coding challenge

The last thread produced a lot of bickering and anger; let's see if we can replicate that.

Rules
Commit your solution to the challenge in this thread. (Solutions written by women and solutions written in Rust receive a 10% bonus when being rated)

Challenge
The major element of a sequence of length L is the element which appears in that sequence more than L/2 times. The challenge is to find that element in each sequence.

Your program should accept as its first argument a path to a filename. Each line of the file contains a sequence of integers N separated by commas. E.g.

Input sample:

92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19
92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92
4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42

For each sequence print out the major element, or print "None" in case there is no such element. E.g.

Output sample:

19
92
None

Constraints:
N is in range [0, 100]
L is in range [10000, 30000]
The number of test cases is at most 40
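For anyone who just wants a baseline to check answers against: a minimal Python 3 sketch of the whole task (straight counting per line, file path as the first argument, exactly as specified above). Not tuned for the ranking site, just a reference.

    import sys
    from collections import Counter

    with open(sys.argv[1]) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            nums = [int(x) for x in line.split(",")]
            value, count = Counter(nums).most_common(1)[0]
            # "major" means strictly more than L/2 occurrences
            print(value if count > len(nums) // 2 else "None")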

Other urls found in this thread:

codeeval.com/browse/132/
hastebin.com/ribenivili.pl
github.com/Lucy/tewi-font
github.com/rayon-rs/rayon/search?utf8=✓&q=env::var&type=
cgi101.com/book/ch3/text.html
youtube.com/watch?v=WaM47Otif04
youtube.com/watch?v=C9SiRNibD14
numba.pydata.org/
mega.nz/#!VPoAWRwQ!MNyWOflyP3HbfCrcCn0WRp008dQJPyixKgBAUphC240
twitter.com/davecheney/status/604837853344989184?lang=en
golang.org/pkg/bufio/#Scanner
spoj.com/submit/TEST/
pastebin.com/raw/S4Hc6tgw
codeeval.com/how_it_works/#ranking
codeeval.com/pricing/

I think there should be a week waiting time before anyone is allowed to reply so OP has to do his homework himself first.

1. sort list
2. begin looping through items starting at beginning of list
3. if the current element's value is equal to the value at the current index + L/2, return that element
4. keep going until that value is found, or you've reached index L/2 (rough sketch below)
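A rough Python sketch of those steps, assuming `seq` is one already-parsed line of integers. The idea works because a value appearing more than L/2 times occupies a run of more than half the sorted list, so some index i has the same value at i + L/2.

    def major_by_sorting(seq):
        s = sorted(seq)
        half = len(s) // 2
        for i in range(len(s) - half):
            # a value spanning more than half of the sorted list shows up here
            if s[i] == s[i + half]:
                return s[i]
        return None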

Not my homework, nice try.


No pseudocode allowed.

op copied the problem from codeeval.com/browse/132/
he wants us to come up with an original answer for him to put on his profile

Done. It could be neater, and I've taken some liberties with the output so that you can't use it as homework, but it works.
Save as homework.perl and use it like so:
perl homework.perl "92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19">19perl homework.perl "92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92">92perl homework.perl "4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42">Noneperl homework.perl>Crashes because I'm too lazy to add an alternative.
hastebin.com/ribenivili.pl

...

Your 'challenge' gets a 110% penalty for this cancer, OP. Hang your head in shame.

OP gets the code he deserves in the language he deserves.
At least he would, but I don't know Rust.

I'm a black Jewish trans M2F otherkin who identifies as lesbian. I felt unsafe coding in Rust which is a toxic masculine name. Coded in Mathematica because it's proprietary.

majorElement[lst_] := SelectFirst[ Tally[lst], #[[2]] > Length[lst]/2 &] /. {Missing[_] -> "None", {x_, c_} -> x}

lmao @people thinking anyone would come to Holla Forums for homework
just post it on cuckoverflow where people routinely solve homework for some upboats

Use a dictionary

O(n), can't do it faster I believe.

It can be done in O(1) time you dumb nigger.

How? **You dumb nigger*

O(k) is asymptotically equivalent to O(1) for any constant k you double nigger.

O(L) then. Happy faggot? Learn standard notation, n is the length of the sequence

Please give me a value of n where O(n) wouldn't be a constant.

n=a variable=the length of the sequence
You're insane. Now show me how to do it in constant time rather than linear, faggot

no you double faggot

Impossible. Only alternative to O(n) would be probabilistic (Monte Carlo) methods up to some desired confidence interval.

No, nigger, that's L.
It's trivial. Do you even know what constant time means?

jesus christ stop taking such obvious bait

this is retarded just create an array of 100 counters and record the largest one.
That's it, now do your fucking homework.
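That idea in a few lines of Python, as a sketch (values run 0..100 inclusive, so it actually needs 101 bins, as someone points out further down):

    def major(nums):
        counts = [0] * 101            # one counter per possible value 0..100
        for n in nums:
            counts[n] += 1
        half = len(nums) // 2
        for value, c in enumerate(counts):
            if c > half:              # strictly more than L/2 occurrences
                return value
        return None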

You don’t deserve those dubs.

I can do it in O(\alpha n):
Get a hash table where putting shit in each bin is O(\alpha)
Put every item there.
I assume the hash table has a "majorest element" and "amount of times it appears" that gets updated when you put the element in each bin (which stores how many times the element appeared).
Then take the majorest element of the hash table.

My solution is "faster" than because I don't do the nlog(n) bit. All you have to do is control \alpha and you're golden (most times \alpha won't go over a constant). Just have a good hash function that doesn't put everything in the same bin.

Right. That there is the hash function. Then my answer is O(L).

It's O(n) you dumb nigger.

...

function major(L)
  local M = {}
  local max_n = nil
  local max_count = 0
  local length = 0
  for n in L do
    M[n] = (M[n] or 0) + 1
    length = length + 1
  end
  for n, count in pairs(M) do
    if count > max_count then
      max_n = n
      max_count = count
    end
  end
  if max_count > length/2 then
    return max_n
  end
end

for line in io.lines(arg[1]) do
  print(major(string.gmatch(line, "%d+")) or "None")
end
Usage:
$ cat data
92,19,19,76,19,21,19,85,19,19,19,94,19,19,22,67,83,19,19,54,59,1,19,19
92,11,30,92,1,11,92,38,92,92,43,92,92,51,92,36,97,92,92,92,43,22,84,92,92
4,79,89,98,48,42,39,79,55,70,21,39,98,16,96,2,10,24,14,47,0,50,95,20,95,48,50,12,42
$ luajit shit.lua data
19
92
None

...

Gotta go fast, faggots. Also, I identify as a woman.

#include #include #include #include #include #include #include #define MAX_L 30000#define SEPARATOR_LENGTH 1#define EOL_LENGTH 1#define MAX_N_LENGTH 1#define MAX_CASES 40#define MAX_LINE_LENGTH ((SEPARATOR_LENGTH * (MAX_L - 1)) + (MAX_N_LENGTH * MAX_L) + EOL_LENGTH)#define MAX_FILE_LENGTH MAX_LINE_LENGTH * MAX_CASESchar buffer[MAX_FILE_LENGTH + 1];uint_fast64_t counts[100];uint_fast64_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};inline uint_fast64_t parse_n(char * restrict * restrict c) { uint_fast64_t retval = 0; char *begin = *c; while(**c >= '0' && **c > 1; for(i = 0; i < 100; ++i) { if(counts[i] > half) { printf("%d\n", i); break; } } if(i == 100) printf("None\n"); }}

It could go faster with pthreads but it's just extra bulk for what would be a pretty boring split of a buffer across threads.

You need 101 bins, faggot.

No, I identify as a woman, I can get away with it.

I am so sorry. Will educate myself to be aware of my own privilege. I will learn from this teaching experience.

Perl beautiful language
perl -nE 'chomp;@a=split/,/;%c;$c{$_}++for@a;say((grep{$c{$_}>=~~@a/2}@a)[0]||None)' cocks
I'm not gay

Post the source not the binary, dumbass.

[code]
What is going on here?

I'm skipping the first initialization since it's guaranteed to start zeroed. Jumping midway into a loop on the first iteration is generally a cleaner way to deal with that.

Argh, I need to sleep, I swear that { wasn't there when I first looked.

Bump. Do something for once, Holla Forums.

Even 4/g/ is better at programming than Holla Forums lmao

import argparsefrom multiprocessing import Pooldef findMajors(sequence): sequence=sequence.split(',') count_list=[] for integer in sequence: count=1 for search in sequence: if integer==search: count+=1 count_list.append(count) major_index=[] index=0 for count in count_list: if count>len(sequence)/2: if index not in major_index: major_index.append(index) index+=1 major_integers=[] for major in major_index: if sequence[major] not in major_integers: major_integers.append(sequence[major]) return major_integers def worker_process(sequence): major_integers=findMajors(sequence) if major_integers==['']: print 'None' else: for major in major_integers: print majordef main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').read().split('\n') except: print 'None' p=Pool(len(sequences)) p.map(worker_process,sequences)if __name__=="__main__": main()

g++ challenge.c -o a.out
challenge.c:21:46: error: expected ‘,’ or ‘...’ before ‘*’ token
 inline uint_fast64_t parse_n(char * restrict * restrict c) {
                                              ^
challenge.c: In function ‘uint_fast64_t parse_n(char*)’:
challenge.c:24:18: error: ‘c’ was not declared in this scope
 char *begin = *c;
                ^
challenge.c: In function ‘int main(int, char**)’:
challenge.c:54:21: error: cannot convert ‘char**’ to ‘char*’ for argument ‘1’ to ‘uint_fast64_t parse_n(char*)’
 n = parse_n(&c);

kys

gcc challenge.c
In function `main':
challenge.c:(.text+0x87): undefined reference to `parse_n'
collect2: error: ld returned 1 exit status

remove the 'inline' from the start of parse_n's definition.

Why are you even on this board if you don't know shit about technology?


No retard.

Amazing, doubt anyone can compete with this


Broken and busted

hipster.

[[$]] major_search.$= list_count/2? ( $::printf{"%d\n" key}. $=0. ). ). $::printf"None\n". 0.).

...

#!/usr/bin/env python3import sysfrom collections import Counterdef sequences(filename): with open(filename, "r") as f: for line in f: yield [int(i) for i in line.split(",")]def major(sequence): if not sequence: return None element, count = Counter(sequence).most_common()[0] if count

lmao

I encourage retards to get by through stealing homework, they'll inevitably fail their exams.

gib terminal configuration

#!/bin/python2.7import argparseimport sqlite3from multiprocessing import Pooldef dowork(sequence): conn=sqlite3.connect(':memory:') cursor=conn.cursor() seq_list=sequence.split(',') target_length=(len(seq_list)/2) cursor.execute ('CREATE TABLE IF NOT EXISTS challenge (integers integer)') for integer in seq_list: cursor.execute('INSERT INTO challenge VALUES (?)',(int(integer),) ) conn.commit() cursor.execute('SELECT integers, COUNT(*) FROM challenge GROUP BY integers HAVING COUNT(*) > (?)',(target_length,)) fetch=cursor.fetchone() if fetch!=None: return fetch[0] else: return 'None' cursor.execute('DROP TABLE challenge') conn.commit() conn.close() def main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').read().rstrip().split('\n') except: print 'None' p=Pool(8) majors=p.map(dowork,sequences) for major in majors: print majorif __name__=="__main__": main()

Am I the only Europoor CS student here? We barely see code here; we mostly focus on the logical/mathematical side of things.

Which is why you'll be able to use ANY language when you graduate.

Wow, you're making me realize just how much I suck at programming, OP.
I tried doing this in C and I'm struggling to parse the fucking input.
I guess this is what happens when you only do algorithms (which is what you should be learning in school).

Also here's a hint to the people in this thread.
If you're using a hash table or any kind of bucket you're probably doing it wrong.

Nope, "computer science" was considered a branch of mathematics when I went to school.

Europoor here, we did Java in the first year and Assembly, Haskell and Prolog in the second. Half of each year and the whole of year 3 got delegated to theory, and we picked our own programming languages to fuck around with. The problem is that we all picked different languages to use primarily, and there were a lot of arguments on Python2 vs 3, PHP vs Perl and Java vs C#.
Theory is great, but you have to get your hands dirty with code too.

use std::collections::HashMap;use std::fs::File;use std::io::{BufRead, BufReader};fn main() { let mut args = std::env::args_os().skip(1); assert_eq!(args.len(), 1, "exactly 1 arg you stupid nigger"); let file = File::open(args.next().unwrap()).expect("i can't open this file you retarded fag"); let reader = BufReader::new(file); let mut map = HashMap::new(); for line in reader .lines() .map(|r| r.expect("your fucking shit isnt utf8")) { let mut count = 0; for i in line.split(',') .map(|s| s.parse::().expect("hitler dindu nuffin")) { *map.entry(i).or_insert(0) += 1; count += 1; } if let Some((k, _)) = map.drain() .filter(|&(_, v)| v >= count / 2) .max_by_key(|&(_, v)| v) { println!("{}", k); } else { println!("None"); } }}

That's because Euro CS is dead. In the '90s, you used to be all code and would do class projects as an entire class led by the professor. It's why Euro CS grads were so good back then. Today, you have the standard post-modern Jewish CS education where you learn no CS and produce CS Grads.

#include #include #define L_MAX 30000struct element { int value; int n;};struct element_list { struct element elements[L_MAX]; size_t index; size_t max_index;};voidinitElementList(struct element_list *elist){ elist->index = 0; elist->max_index = 0;}intfindElementByValue(int value, struct element_list *elist){ int i; for (i = 0; i < elist->max_index; i++) { if (elist->elements[i].value == value) return i; } return -1;}voidaddValueToList(int value, struct element_list *elist) { int idx; if ((idx = findElementByValue(value, elist)) == -1) { elist->elements[elist->max_index].value = value; elist->elements[elist->max_index].n = 1; elist->max_index++; } else { elist->elements[idx].n++; }}intfindMajorElement(struct element_list *elist){ int i; int max = 0; int max_index = 0; if (elist->max_index == 0) return -1; for (i = 0; i < elist->max_index; i++) { if (elist->elements[i].n > max) { max = elist->elements[i].n; max_index = i; } } if (max > (elist->max_index / 2)) { return elist->elements[max_index].value; } return -1;}intmain (int argc, char *argv[]){ FILE *fp; char c; char buf[3]; size_t buf_idx = 0; struct element_list elist; int major; initElementList(&elist); if (argc == 1) { printf("No input file.\n"); } else { while (--argc > 0) { if (!(fp = fopen(*++argv, "r"))) { printf("Can not open file."); exit(1); } else { while ((c = getc(fp)) != EOF) { if ((c >= '0') && (c

0.2pajeet
I hate you

pic related


Failed

Xresources
URxvt.perl-ext:URxvt.perl-ext-common: selection-to-clipboard#define termfont xft:tewi:pixelsize=13:antialias=falseURxvt.font: termfontURxvt.boldFont: termfontURxvt.intensityStyles: falseURxvt.boldMode: falseURxvt.skipBuiltinGlyphs: trueURxvt.loginShell: trueURxvt*title: urxvtURxvt*termName: rxvt-unicodeURxvt*urgentOnBell: trueURxvt.letterSpacing: 1URxvt.borderless: trueURxvt.internalBorder: 20URxvt.scrollBar: falseURxvt.scrollBar_right: falseURxvt.scrollstyle: plainURxvt.jumpScroll: falseURxvt.scrollWithBuffer: trueURxvt.scrollTtyKeypress: trueURxvt.scrollTtyOutput: falseURxvt.buffered: trueURxvt.pointerBlank: trueURxvt.underlineURLs: trueURxvt.saveLines: 2048URxvt.url-select.launcher: /opt/waterfox/waterfox-binURxvt.url-select.underline: trueURxvt.iso14755: falseURxvt.iso14755_52: falseXft.antialias: trueXft.autohint: falseXft.dpi: 96Xft.hinting: trueXft.hitstyle: hintslightXft.lcdfilter: lcddefaultXft.rgba: rgbXcursor.size: 15URxvt.utf8: trueURxvt.locale: true! X colors.! Generated by 'wal'emacs*foreground: #BDBBD2emacs*background: #1D1C22URxvt*foreground: #BDBBD2XTerm*foreground: #BDBBD2UXTerm*foreground: #BDBBD2URxvt*background: [100]#1D1C22XTerm*background: #1D1C22UXTerm*background: #1D1C22URxvt*cursorColor: #BDBBD2XTerm*cursorColor: #BDBBD2UXTerm*cursorColor: #BDBBD2URxvt*borderColor: [100]#1D1C22! Colors 0-15.*.color0: #1D1C22*color0: #1D1C22*.color1: #588877*color1: #588877*.color2: #606F8B*color2: #606F8B*.color3: #D060AE*color3: #D060AE*.color4: #19A494*color4: #19A494*.color5: #609E9B*color5: #609E9B*.color6: #43DBD4*color6: #43DBD4*.color7: #BDBBD2*color7: #BDBBD2*.color8: #77767a*color8: #77767a*.color9: #588877*color9: #588877*.color10: #606F8B*color10: #606F8B*.color11: #D060AE*color11: #D060AE*.color12: #19A494*color12: #19A494*.color13: #609E9B*color13: #609E9B*.color14: #43DBD4*color14: #43DBD4*.color15: #BDBBD2*color15: #BDBBD2! Black color that will not be affected by bold highlighting.*.color66: #1D1C22*color66: #1D1C22! Rofi colors.rofi.color-window: #1D1C22, #1D1C22, #606F8Brofi.color-normal: #1D1C22, #BDBBD2, #1D1C22, #606F8B, #1D1C22rofi.color-active: #1D1C22, #BDBBD2, #1D1C22, #606F8B, #1D1C22rofi.color-urgent: #1D1C22, #588877, #1D1C22, #588877, #BDBBD2! Xclock colors.XClock*foreground: #BDBBD2XClock*background: #1D1C22XClock*majorColor: rgba:bd/bb/d2/ffXClock*minorColor: rgba:bd/bb/d2/ffXClock*hourColor: rgba:bd/bb/d2/ffXClock*minuteColor: rgba:bd/bb/d2/ffXClock*secondColor: rgba:bd/bb/d2/ff! Set depth to make transparency work.URxvt*depth: 32

font: github.com/Lucy/tewi-font

The same thing without the hashmap
use std::fs::File;use std::io::{BufRead, BufReader};use std::env::args_os;fn main() { let filename = args_os().skip(1).next().expect( "exactly 1 arg you stupid nigger", ); let reader = BufReader::new(File::open(filename).expect( "i can't open this file you retarded fag", )); for line in reader.lines().map( |r| r.expect("your fucking shit isnt utf8"), ) { let mut buf: [u32; 101] = [0; 101]; let mut cnt = 0; for n in line.split(',') { buf[n.parse::().expect("hitler dindu nuffin")] += 1; cnt += 1; } match buf.into_iter().position(|x| *x > cnt / 2) { Some(n) => println!("{}", n), _ => println!("None"), } }}

Now we need the Ada version.

expect
the English word "expect"
used for that

...

Alright, you win. 99 then.

101 my black friend.

I bet you were the brightest kid in the short bus.

Thank me for bothering to correct you.

...

So how does it perform?

explain

i wonder why this isn't faster than it is. it's not much faster than my straight python solution, isn't the sqlite3 module for python C? All the work here should be done by sqlite3 but it's still slow.

Databases are ridiculously fucking slow, user. And sqlite3 is slow for a database.

There's up to 30,000 numbers in each test right? You're probably spending a lot of time inserting each row individually. Look into doing a bulk insert, or possibly grouping those inserts into a single transaction. Of course this is just conjecture, and you should profile to see where time is being spent.
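Something like this, as a sketch (the table and column names just mirror the code above): hand the whole batch to executemany inside a single transaction instead of one INSERT and commit per row.

    import sqlite3

    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE challenge (integers integer)')
    rows = [(n,) for n in (92, 19, 19, 76)]      # stand-in for the parsed line
    with conn:                                   # one transaction for the whole batch
        conn.executemany('INSERT INTO challenge VALUES (?)', rows)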

A script to create test data if anyone wants it.

#!/usr/bin/python
from random import random
for t in xrange(40):
    l = 10000 + int((30000 - 10000 + 1) * random())
    bias = 10 * random()
    result = []
    for i in xrange(l):
        n = int(random()**bias * (100 + 1))
        result.append(n)
    print ",".join([str(x) for x in result])

i was _just_ about to write one myself. thx

same, already did too
import randomstring=''row_count=40integer_count=50000row_counter=0while(row_counter

doing a batch insert with executemany instead of one by one improved performance by 500%.

testing on 40 rows with 50k integers each:
10-11s with

2-2.4s with
#!/bin/python2.7import argparseimport sqlite3from multiprocessing import Pooldef dowork(sequence): conn=sqlite3.connect(':memory:') cursor=conn.cursor() seq_list=sequence.split(',') target_length=(len(seq_list)/2) seq_list_int=[] for integer_str in seq_list: seq_list_int.append( ( int(integer_str),) ) cursor.execute ('CREATE TABLE challenge (integers integer)') cursor.executemany('INSERT INTO challenge VALUES (?)',seq_list_int) conn.commit() cursor.execute('''SELECT integers,max(rowid) FROM challenge GROUP BY integers HAVING COUNT(1) > (?) LIMIT 1''',(target_length,)) fetch=cursor.fetchone() if fetch!=None: return fetch[0] else: return 'None' cursor.execute('DROP TABLE challenge') conn.commit() conn.close() def main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').read().rstrip().split('\n') except: print 'None' p=Pool(8) majors=p.map(dowork,sequences) for major in majors: print majorif __name__=="__main__": main()

wat

IMPORTANT: OP PLEASE SPECIFY
are you saying that there is only one element which appears more than L/2 times?

using list comprehensions to generate the tuple list being fed into the table is another 20% performance boost, now 1.8-2.0s instead of 2-2.4

def dowork(sequence): conn=sqlite3.connect(':memory:') cursor=conn.cursor() seq_list=sequence.split(',') seq_list=[(x,) for x in seq_list ] target_length=(len(seq_list)/2) cursor.execute ('CREATE TABLE challenge (integers integer)') cursor.executemany('INSERT INTO challenge VALUES (?)',seq_list) conn.commit() cursor.execute('''SELECT integers,max(rowid) FROM challenge GROUP BY integers HAVING COUNT(1) > (?) LIMIT 1''',(target_length,)) fetch=cursor.fetchone() if fetch!=None: return fetch[0] else: return 'None' cursor.execute('DROP TABLE challenge') conn.commit() conn.close()

I thought i screwed up here
seq_list=[(x,) for x in seq_list ]
should have been
seq_list=[(int(x),) for x in seq_list ]

but apparently it doesn't matter, sqlite will do the conversion if it can, which is probably part of the 20% performance increase.

How can there be more than one, Pajeet?

cursor.execute('DROP TABLE challenge') conn.commit() conn.close()

Those lines are unreachable, aren't they?

ya that's another fuckup, that should go above
if fetch!=None:

i don't know how necessary it even is, python probably closes it and garbage collects it anyway, in theory.

i made the fastest version: extern crate memchr;extern crate memmap;extern crate rayon;use std::cmp::{min};use memchr::{memchr};use memmap::{Mmap};use rayon::prelude::*;const MAX_TEST_CASES: usize = 40;const MAX_N: usize = 100;const MIN_L: usize = 10000;const MAX_L: usize = 30000;const MIN_LINE_LEN: usize = MIN_L + MIN_L - 1;const MAX_LINE_LEN: usize = MAX_L * 3 + MAX_L - 1;const MAX_TESTS_LEN: usize = MAX_LINE_LEN * MAX_TEST_CASES + MAX_TEST_CASES;fn main() { let mut args = std::env::args_os().skip(1); assert_eq!(args.len(), 1); let mmap = Mmap::open_path(args.next().unwrap(), memmap::Protection::Read).unwrap(); let slice = unsafe { mmap.as_slice() }; for &r in do_it(slice).iter().filter(|&r| r.is_some()) { match r.unwrap() { Some(v) => { println!("{}", v); } None => { println!("None"); } } }}fn do_it(slice: &[u8]) -> [Option; MAX_TEST_CASES] { debug_assert!(slice.len() = MIN_LINE_LEN); let (r, rs) = results.split_first_mut().unwrap(); match memchr(b'\n', &slice[MIN_LINE_LEN..min(slice.len(), MAX_LINE_LEN + 1)]) { Some(i) => { let (sl, sr) = slice.split_at(MIN_LINE_LEN + i); rayon::join(|| process_line(sl, r), || split_line(&sr[1..], rs)); } None => { rayon::join(|| process_line(slice, r), || {}); } }}fn process_line(slice: &[u8], result: &mut Option) { debug_assert!(slice.len() >= MIN_LINE_LEN); debug_assert!(slice.len() = c / 2) { Some((i, c)) => { debug_assert!(c >= MIN_L); debug_assert!(c { *result = Some(None); } }}fn parse(b: &[u8]) -> usize { debug_assert!(b.iter().all(|&b| b >= b'0' && b { (b[0] - b'0') * 10 + (b[1] - b'0') } 3 => { 100 } _ => { unreachable!() } }) as usize}

sheeeeeeiiiiiit im retarded

that was in the back of my mind too until i thought about it for a second

We all have our moments of diversity.

How do you run this shit? I tried running it and got:
thread '' has overflowed its stack
fatal runtime error: stack overflow
Aborted
Also, how does one even know what source is being pulled into these rust projects?
Downloading rayon-core v1.2.1 Downloading futures v0.1.16 Downloading coco v0.1.1 Downloading rand v0.3.17 Downloading lazy_static v0.2.9 Downloading num_cpus v1.7.0 Downloading scopeguard v0.3.3 Downloading either v1.3.0 Downloading memmap v0.5.2 Downloading libc v0.2.32 Downloading memchr v2.0.0
Has anyone ever looked at what these are? How many are from complete nobodies with no oversight and no maintenance? How do I get the exact source that the version installed corresponds to to see if they're evil?

holy fuck you stupid nigget increase you stack size limit

I have no idea how to make your shitty language work. If you need stupid amounts of ram or whatever, say what options to build with.

look into the target folder. cargo caches the crates there

great language steve

the Cargo.toml tells you exactly what gets pulled
it pulls the requested crates from crates.io

very low energy bait.

dohoho, we'll see about that.
I built your shit on an i7-6700k in --release mode and ran the binary directly rather than via cargo using test data from and compared it to mine in . If there's some other bullshit I need to do to make rust go fast, tell me. Otherwise,

Yours:
real    0m0.020s
user    0m0.108s
sys     0m0.016s

Mine:
real    0m0.009s
user    0m0.008s
sys     0m0.000s

You got beat by someone who identifies as a girl for purposes of determining score! Girl power!

And not just beat, fucking destroyed. Let's look closer at that. You used 8 threads (6700k) against a single-threaded program and still lost. Not only did you lose, you managed to waste 0.108s of CPU time for a 0.020s runtime. So in reality rather than an artificial test where your program would be competing for CPU time with other programs, it's more accurate to say you need 108ms of CPU time to produce a result vs 8ms of CPU time. You also used 3x the ram. gj Rust community!

Have you looked at any of it? I looked at what was being pulled in via rayon just now and they're unsafely trusting the environment. I can make any Rust program using that crate crash if I can get "RAYON_NUM_THREADS" into the environment. Did the Rustfags learn nothing from shellshock? Fucking christ, this is like a '90s tier exploit. Webdevs have no business writing systems code.

fugg XDDDD
my rust version is really way slower

Also, the funny thing is rayon disables checking for RAYON_LOG in release builds as they recognize trusting the environment is unsafe (I was hoping I could overwrite your /etc/passwd), but they neglected to do that for their other environment variables. Check log.rs and lib.rs. I wonder how much of the Rust ecosystem is this insecure, or outright malicious, since no one checks.

shit i really fucked up bigly. for some reason this shit doesnt even utilize all my cores.
well at least is is memory and thread safe XDDDDDDD

with the modified function from

where are these tests coming from?

What the fuck am I reading? If you can't trust the environment you have bigger problems than someone setting RAYON_NUM_THREADS to over 9000.

from some pajeet code challenge/practice website thing

stay mad kiddo

It's like you've already forgotten the lessons re-learned from shellshock. The environment should not be trusted. There are too many ways to get information into it, and it's led to hundreds of exploits over the years.

LOL. bash did nothing wrong. webdevs did

LOL. webdev confirmed

the same can be said about sql queries.

It's like you don't know Python.
#!/usr/bin/python3ii=__import__;i=ii("random");r,ri=i.random,i.randint;m=lambda lll:str(int(lll()));[getattr(__builtins__,getattr("","".join([chr(i) for i in sum([[j,j+5] for j in [[k*5+1,k*5] for k in [21]][0]],[])]))([chr(i) for i in [112,114,105,110,116]]))(",".join([getattr(m,"".join([j if chr(99)==j[0] else i for i,j in [[(['__']*3)[i],['wall', 'call', 'mall'][i]] for i in range(3)]]))(lambda:ri(0,100)) for j in range(ri(10000, 30000))])) for i in range(40)]
Sadly can't add bias here, though.

"Nope, must not be a security problem!"
You're a fucking moron, user. Please never work on anything important.

lol stop sperging out, webdev. it is not a programs fault if its environment is compromised. maybe webdevs should stop putting malicious stuff into the environment????

literally the only two environment variables rayon reads are RAYON_NUM_THREADS and RAYON_LOG: github.com/rayon-rs/rayon/search?utf8=✓&q=env::var&type=
rayon enables logging only in debug mode or if you enable a compiler flag. it isnt diabled because "muh dangerous env variables".

fucking kill yourself you fag.

Hello, eternal September. Enjoy re-learning what we learned the hard way. lol.

not an argument, lol. reading environment variables is totally ok and if there is shit in there that isnt supposed to be in there, it isnt my fault, but the webdevs who put it there.
if you dont want a particular program using environment variables because you are a webdev and might have put shit into it, just clear them.

Quality defense-in-depth on display from the eternal September.

holy shit you fucking retard. you won. i wont read env variables anymore. i also wont use command line arguments, files, user input and all that good stuff. after all, if my programm does absolutely nothing than it cant do anything wrong, right?
in fact you advice made even rust obsolete, since accessing memory is insecure.

You type like a child.

amazing argument. you win 10 interwebz

#include #include #include #define N 101int main(int argc, char** argv) { uint_fast16_t count[N]; std::memset(count, 0, sizeof count); if (std::ifstream infile{argv[1], std::ios::binary}) { uint_fast16_t n = 0; uint_fast16_t L = 0; while (infile >> n) { ++count[n]; ++L; if (infile.peek() == ',') { infile.ignore(1); } else { L /= 2; bool found = false; for (size_t i = 0; i < N; ++i) { if (count[i] >= L) { std::cout

What about memory usage in either of those programs?

euro cs student here,
people graduating from cs in europoora can't even write a for loop

i didn't think about doing it that way at all, i'm going to write something similar in python and see how shitty the python performance doing the same type of thing is.

I mentioned the core size difference was 3x. The C version is about 3M (the size of the worst-case test file that it reserves space for). The rust version is about 9M but I have no idea how to reason about where the memory is getting used as I don't know rust. It might just be from all the stack segments for the threads.

...

they don't. CGI does. Any webserver that implements it correctly will do likewise. It's webdevs who are responsible for being three degrees removed from the people responsible for responsibly handling their environment variables, just like any other program input.


kill yourself. this was a completely retarded 'feature' that would've been a massive problem even if the web had not existed at all. How common is it for programs to cheap out and invoke a subshell when not strictly necessary? Especially when the command is safe on its face? Bash made that unsafe.

EATSHIT="I don't even remember how this useless 'feature' goes" /usr/bin/some_setuid_admin_bin
^ now any subshells also does an rm -rf as root. Nice fucking job.

ok. but does cgi write to RAYON_NUM_THREADS or RAYON_LOG?

How is any of this malicious?
cgi101.com/book/ch3/text.html

can you two autists go get your own thread where you can hatefuck each other all you want?

right after you post some code

Slightly changed to use more idiomatic C++
#include #include #include #include #include #define N 101int main(int argc, char** argv) { std::array count; std::fill(count.begin(), count.end(), 0); if (std::ifstream infile{argv[1], std::ios::binary}) { uint_fast16_t n = 0; uint_fast16_t L = 0; while (infile >> n) { ++count[n]; ++L; if (infile.peek() == ',') { infile.ignore(1); } else { L /= 2; auto loc = std::find_if(count.begin(), count.end(), [&](uint_fast16_t& i) { return i >= L; }); if (loc != std::end(count)) { std::cout

i really need to stop wasting time with python.

what do you mean?

i've spent way too long writing everything in python and it would take time now to get back into C. python is great for glue and all the libraries and all that but clearly the performance no matter what you do is going to be 100 times worse than C.

mfw even perl5 is faster than python

Since the rustfag is trying to beat my single-threaded code with threads, I'm going to swoop in here and 'improve' my entry. It is unbeatable without just micro-optimizing this code and resubmitting it. Bow down to your queen, and learn something from my superior technique. Time this with that site of yours, and it better have more than one goddamn core available after I spent time doing this.

#define _GNU_SOURCE#include #include #include #include #include #include #include #include #include #include #include #define MAX_WORKERS 64#define MAX_CASES 40static char *buffer;static uint_fast16_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};static long processors;static off_t size;static inline uint_fast8_t parse_n(char * restrict * restrict c) { uint_fast8_t retval = 0; char *begin = *c; while(**c >= '0' && **c buffer) { c = strchr(c - 1, '\n') + 1; } if(!*c) return NULL; worker->begin = c; char *end; if(processor < processors) end = buffer + (size * (processor + 1)) / processors; else end = buffer + size; goto start; for(; c < end; ++c) { (void) memset(worker->counts, 0, sizeof(worker->counts));start: for(l = 1; ; ++l, ++c) { n = parse_n(&c); ++worker->counts[n]; if(*c == '\n') break; } uint_fast16_t half = l >> 1; for(i = 0; i < sizeof(worker->counts) / sizeof(worker->counts[0]); ++i) { if(worker->counts[i] > half) { sprintf(worker->answer[worker->answers], "%d", i); break; } } if(i == sizeof(worker->counts) / sizeof(worker->counts[0])) sprintf(worker->answer[worker->answers], "None"); ++worker->answers; } return NULL;}int main(int argc, char *argv[]) { struct stat stat; int fd = open(argv[1], O_RDONLY); fstat(fd, &stat); size = stat.st_size; buffer = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0); errno = 0; processors = sysconf(_SC_NPROCESSORS_ONLN); if(processors > MAX_WORKERS) processors = MAX_WORKERS; for(long processor = 0; processor < processors; ++processor) { pthread_attr_t attr; pthread_attr_init(&attr); cpu_set_t cpus; CPU_ZERO(&cpus); CPU_SET(processor, &cpus); pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpus); pthread_create(&workers[processor].thread, &attr, &thread_start, (void *) processor); } for(long processor = 0; processor < processors; ++processor) { worker_t const *worker = &workers[processor]; pthread_join(worker->thread, NULL); } for(long processor = 0; processor < processors; ++processor) { worker_t const *worker = &workers[processor]; if(processor + 1 < processors && workers[processor].begin == workers[processor + 1].begin) continue; for(int i = 0; i < worker->answers; ++i) { printf("%s\n", worker->answer[i]); } } return 0;}

Thing is, if you look close, this is maximum faggotry. I would never write code like this normally. It is highly optimized to have an extremely good wall clock time (how these tests are scored) but will sacrifice CPU time to do it. In the real world where you have other shit running, that's pretty dumb. But winning is everything, so whatever. Some highlights:

- It gambles that the tests are being run in an unclean state and uses mmap() to swipe the data from the fs cache. mmap() is slower in the real world due to the cost of page faults.

- It divides the buffer across threads and has each thread find its starting position independent of the others rather than what you're taught in school: to make a list of the work (e.g., split the lines) then divide it amongst work queues. This is a big optimization for wall-clock time but it means there is overlap and some results might be computed and thrown away (wasted CPU time). Threads will have a less balanced workload with this technique, but it's a huge win for wall clock time not having to scan the entire file on a single thread, so it works out.
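A much-simplified Python sketch of that splitting idea, for anyone who finds the C hard to follow: each worker gets a byte range of the file, skips forward to the first line that starts inside its range, and handles lines until it crosses its end offset. This version avoids the overlap by peeking at the byte before the range start; the names and the 'data' path are made up.

    import os
    from multiprocessing import Pool

    def majors_in_range(args):
        path, start, end = args
        out = []
        with open(path, 'rb') as f:
            if start > 0:
                f.seek(start - 1)
                if f.read(1) != b'\n':
                    f.readline()        # finish the partial line; the previous worker owns it
            else:
                f.seek(0)
            while f.tell() < end:       # only lines that *start* inside [start, end)
                line = f.readline()
                if not line:
                    break
                counts = [0] * 101
                nums = line.rstrip(b'\n').split(b',')
                for n in nums:
                    counts[int(n)] += 1
                half = len(nums) // 2
                hit = next((i for i, c in enumerate(counts) if c > half), None)
                out.append('None' if hit is None else str(hit))
        return out

    if __name__ == '__main__':
        path = 'data'                   # made-up input file name
        size = os.path.getsize(path)
        workers = os.cpu_count() or 1
        ranges = [(path, size * i // workers, size * (i + 1) // workers)
                  for i in range(workers)]
        with Pool(workers) as p:
            for chunk in p.map(majors_in_range, ranges):
                if chunk:
                    print('\n'.join(chunk))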

FWIW, I (your queen) heavily use Python, C, and C++. They're all great at their niche. Just don't use the wrong tool for the job.

/tmp/ccMmS4zt.o: In function `main':
source.c:(.text+0x364): undefined reference to `pthread_attr_setaffinity_np'
source.c:(.text+0x37c): undefined reference to `pthread_create'
source.c:(.text+0x3b0): undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status

Sorry queen

-lpthread, newfag.

Can't influence switches on the testing site, sorry queer queen

That's pretty terrible. So it allows threads in C++ and Rust but locks out C? What a Pajeet website.

quit college. try making demands for my money back no matter how unlikely that is. drop hints about coming back to get a degree in philosophical proctology
sell all property
go to renfair
buy bastard sword
travel to physical locations of testing site's admins
"shall we cross steel, churl?"

I haven't checked if it supports C++ threads.
It doesn't support Rust at all lel

Since it locks it out, here's results from a 6700k using that test data from python,

real    0m0.001s
user    0m0.004s
sys     0m0.000s

1 MS

Compare with the single-threaded code I gave numbers for in ,

real    0m0.009s
user    0m0.008s
sys     0m0.000s

So I'd expect if that shitty site could run it, since it was at 3ms in , it might register as 0ms.

...

The site seems to value low memory usage more than cpu usage after a certain threshold for cpu time has been crossed.

i changed your single threaded version to also memory map the file and it is as fast as your multithreaded version

It needs a large test case to notice a difference. The script in should be enough. It scales really well so the larger the data gets the better it'll do.

lol no i tripled the number of tests and now the single threaded version is faster

You might have a shitty CPU, user. You can try limiting the MAX_WORKERS to half to see if you've got a bad implementation of hyperthreads.

Also, if you've got something burning CPU for no reason, I set affinity on each thread and the slowest one determines your runtime so that will create a bottleneck. You can comment out the affinity stuff and have the scheduler route around that, but it's less efficient.

i tried 32, 16, 8 and 4 workers. single threaded version is still better.

Queen,
what is your dayjob? Are you from NA or EU?

It points to a problem with your setup. The bottleneck code is essentially the same other than setting affinity to run on a core the scheduler might not have otherwise chosen. You should look into it further.

no my setup is fine. it is just that your multi threaded version is shit. modify the single threaded to use mmap and see for yourself

I do low-level networking on embedded devices and gamedev. Simultaneously. It's been a strange ride. I'm currently working 12 hours a day.

Also, in socal. And I've not slept yet. It's 8:30am..

i smell LARPing

i doubt a LARPer could produce the C source in this thread

are you also actually a girl, because that'd be hilarious

i mean it really isnt that hard. if you cant do that you should consider enrolling into a cs course

Modified to use mmap:
real    0m0.008s
user    0m0.008s
sys     0m0.000s
Again, your setup is shit. Like I said, there's no significant change in the bottleneck. It's purely going to be an issue with affinity that you can solve.

I had today off to deal with an immigration interview for my wife (she's not mail order, just Canadian).

no you fag. my setup is fine. your shitty multithreaded version just sucks.

...

No, I'm a 40 year old WHITE MALE.

k

Hi i'm tyrone, want me to take care of your wife?

B-b-but muh SAFE memory usage!!!! It's safe.

yes it is safe. when i wrote this piece of shit rust code i intended it go fast and didnt give a shit about memory usage. unfortunately i didnt benchmark this trashy codevomit before i proudly presented it to you.

No, you cannot assume that.

i deleted the last post because i accidentally posted the wrong fucked up code. this is the correct code, and time is down to 1.0-1.2s compared to 2.2s with

still obviously shit compared to these C solutions, it's 1.8s with 1 process.
the improvement here is get rid of half of the sequence immediately, since the number is guaranteed to be in 50%+1 of that sequence.

#!/bin/python2.7import argparseimport sqlite3from multiprocessing import Pooldef dowork(sequence): conn=sqlite3.connect(':memory:') conn.execute ('PRAGMA journal_mode = OFF') conn.execute ('PRAGMA synchronous = OFF') conn.execute ('PRAGMA locking_mode = EXCLUSIVE') cursor=conn.cursor() seq_list=sequence.split(',') target_length=(len(seq_list)/2) #drop half of the sequence, unnecessary seq_list=seq_list[:target_length] seq_list=[(x,) for x in seq_list ] cursor.execute ('CREATE TABLE challenge (integers integer)') cursor.executemany('INSERT INTO challenge VALUES (?)',seq_list) conn.commit() cursor.execute('''SELECT integers,COUNT(1) as integers FROM challenge GROUP BY integers ORDER BY COUNT(1) DESC LIMIT 1''') fetch=cursor.fetchone() cursor.execute('DROP TABLE challenge') conn.commit() conn.close() if fetch==None: return None elif fetch[1]>((target_length/2)-1): return fetch[0] else: return Nonedef main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').read().rstrip().split('\n') except: print 'None' p=Pool(1) majors=p.map(dowork,sequences) for major in majors: print major dowork(sequences[0])if __name__=="__main__": main()

Stop saying that. How many ways are there to select HALF of the numbers?

i need a drink, still fucked up the post, delete
dowork(sequences[0]) in main(). that was just there for testing.

> seq_list=sequence.split(',')> target_length=(len(seq_list)/2)> #drop half of the sequence, unnecessary> seq_list=seq_list[:target_length]

that is a very happy looking nigger. what is he looking at?

You're in for a treat. Just watch the first minute.
youtube.com/watch?v=WaM47Otif04

youtube.com/watch?v=C9SiRNibD14

sergio, el scripto no werko

Noob here, why is the C++ solution more concise than the C one but still rather efficient? Isn't C++ shit?

because you're getting more done per line, duh. the parse_n in the cheater threads code would not even take up a single line in a scripting language; it'd be one subexpression of a line, like string.gmatch in the lua:
print(major(string.gmatch(line, "%d+")) or "None")
language shitness isn't just about code terseness, or we would also shill APL

But people here say C is the best and shit like C++ and Java are bloated heaps of trash.

i'm waiting for someone to post the x86 assembly solution.

again: is APL the best? There's more to language shitness than terseness. Terseness doesn't even reliably translate to humans spending less time on writing/reading/troubleshooting that span of code.

The maximum will occur in some half+1 length subsequence, but you made the false assumption that the subsequence necessarily starts at the beginning. Imagine the subsequence as a sliding window, and you'll see why that assumption is wrong.

See gif related.
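Concretely, for anyone still not seeing it: take a line where the major element only shows up in the back half. The "drop half the sequence" version misses it; the full count doesn't. Tiny Python check with made-up data:

    from collections import Counter

    seq = [1, 2, 3, 7, 7, 7, 7]     # 7 appears 4 of 7 times, so 7 is the major element
    half = len(seq) // 2

    print(Counter(seq[:half]).most_common(1))   # [(1, 1)]: 7 never appears in the front half
    print(Counter(seq).most_common(1))          # [(7, 4)]: the real answer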

No wonder you aren't God's chosen high priest

Holla Forums said C++ is shit so i refuse to believe it can be as good as gods language also known as C.

ok mang. You saw through my attempts at deflection. Here's the truth: C is God's language and C++ is complete shit. C++'s shitness does not extend to it not having a number of conveniences that show off in toy code like this. The devil's succubi do not appear as ugly hags, but beautiful maidens. Similarly C++ has superficial benefits that lure you into letting it waste tremendous amounts of brainspace on it. C++ is the devil's own attempt at a DOS attack on your brain. Though warlocks and sinners believe that they can deal with the devil, extract some narrow advantage ("I'll just use Qt for the GUI"), and then flee, they invariably fall to hell as well, and become mad and incapable of safe coding. Although some will claim to have successful careers, this is a combination of survivor's bias and the AIDS-patient mental weakness of such people not coming across in text.
Save yourself.

package main

import (
    "bufio"
    "fmt"
    "os"
    "strconv"
    "strings"
)

func process(filename string) {
    f, err := os.Open(filename)
    if err != nil {
        panic(err)
    }
    defer f.Close()
    s := bufio.NewScanner(f)
    for s.Scan() {
        L := 0
        hist := make([]int, 101)
        numStrs := strings.Split(s.Text(), ",")
        for _, numStr := range numStrs {
            num, err := strconv.Atoi(numStr)
            if err != nil {
                panic(err)
            }
            if num < 0 || num > 100 {
                panic("number out of range")
            }
            L++
            hist[num]++
        }
        func() {
            for num, count := range hist {
                if count > L/2 {
                    fmt.Println(num)
                    return
                }
            }
            fmt.Println("None")
        }()
    }
}

func main() {
    if len(os.Args) != 2 {
        fmt.Printf("usage: %s [input]\n", os.Args[0])
        return
    }
    process(os.Args[1])
}
Go
am grill

kek. Reminds me of my dance with the devil called Haskell. After reading a paper on how the type system can solve a rubiks cube, I really questioned what I was doing with my life.

...

^_^

could have brought memory usage down a bit by reusing hist instead of making a new one, but since Go binaries are usually several megabytes by themselves, there's no point.

Current powerranking:

1
C/C++ (the site actually gave the C++ one a higher rating even though it's somewhat slower)

2
Python

3
Go

?
[Rust] (no metrics exist for the rust program so not sure where it ranks)

because the site doesn't support it? what languages does it support then?

You have to use these languages I believe:


No Erlang, Ada, Rust, APL, J... total pleb tier.

The goal was to be fast, not concise. The C solutions could be made concise.

Ahh, just occurred to me that even this is wrong too. It's relying on the assumption of a continuous subsequence... and why should it be? The number of ways of selecting half+1 from even just 100 is astronomical. Trying to skip inputs is a pure dead end.
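If you want an actual number for that claim (Python 3.8+ for math.comb):

    import math
    # ways to choose half+1 positions out of just 100
    print(math.comb(100, 51))   # a 29-digit number, roughly 1e29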

Jesus, this site.. What metric do you think it's weighing higher?

their signups are also broken. I never got the confirmation email. searched and it's a known problem.

#!/bin/bashdeclare -a numberswhile read linedo IFS=', ' read -r -a array ${#numbers[@]} / 2 )) then echo "$max" else echo "None" fi for i in "${!numbers[@]}" do numbers[$i]=0 donedone < "$1"

Is that fucking debian oldoldstable?

CPU time does matter but after a certain limit it doesn't net you any more points. Low memory gives a lot of points.

I wonder if it'd count the shared mmap version as the full file size even though it only actually needs 4k of ram. I'll send you that version when I get home.

Step aside, Pajeet!

import Control.Arrow
import Data.List
import Data.List.Split
import Data.Maybe
import System.Environment

main = getArgs >>= readFile . head >>= putStr . unlines . map majorElement . lines

majorElement = maybe "None" snd . listToMaybe . uncurry moreThanHalf . (length &&& count) . splitOn ","

count = map (length &&& head) . group . sort

moreThanHalf l = filter ((> l `div` 2) . fst)

...

I don't have enough autism for haskell.

Imagine you have a machine, that when you feed a coin of type A, it outputs another machine that takes coins of type B. When you give that machine a B, it gives you a C.

What uncurry does, it wraps those details up, and says, if you give me an A and B right now, I'll give you a C.

Actually I missed one step at the end, it's a little more subtle. Uncurry is a machine that takes the machine I first described, and gives you a machine that takes an A and B at the same time. Of course, once you subsequently give that machine an A and B, you get a C.
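A rough Python analogue of those machines, in case the Haskell is opaque (the names are made up; this is obviously not how the Haskell library defines it):

    def add(a):                # a machine that takes an A and hands back a machine that takes a B
        def takes_b(b):
            return a + b       # ...which finally gives you a C
        return takes_b

    def uncurry(f):            # wrap it: take the A and the B at the same time
        return lambda a, b: f(a)(b)

    print(add(2)(3))            # 5, feeding the coins one at a time
    print(uncurry(add)(2, 3))   # 5, feeding both at once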

fuck, you're right, this solution isn't going to work at all. it only works on the test cases because the major numbers happen to be in > 50% of the first half of the sequence, but they don't have to be. i thought for a second you could drop every other number instead to fix it, but that's the exact same problem.

No worries, as put it, we all have these moments. The difference is being autistic enough to care in the first place in rigorously attacking the problem. A true Pajeet can shit in his code base like this and be proud of it.

ye olde sepples #include #include #include templateinline uint16_t count_each(std::string line, uint16_t(& count_of)[N]){ uint8_t curval = 0; uint16_t n_numbers = 1; for (size_t i = 0, sz = line.size(); i < sz; ++i){ char c = line[i]; if (c == ','){ ++count_of[curval]; curval = 0; ++n_numbers; continue; } curval = curval * 10 + (c - '0'); } ++count_of[curval]; return n_numbers;}templateinline size_t argL2(uint16_t(& count_of)[N], uint16_t l2){ for (size_t i = 0; i < N; ++i) if (count_of[i] > l2) return i; return N;}inline void major(const char* filename){ std::ifstream file(filename); for (std::string line;;){ getline(file, line); if (!file) return; uint16_t count_of[101] = {}; const uint16_t sz = count_each(std::move(line), count_of); const size_t res = argL2(count_of, sz / 2); if (res == sizeof(count_of) / sizeof(count_of[0])) std::cout

a simple for loop that reads integers into an array while the ifstream object != ‘,’ would be more succinct.

True, but I was more trying to limit the number of ifstream calls for efficiency.
Don't know how much more efficient it actually is, though :^)

This one should show up as low memory as the buffer is only necessarily 4k, but the site might be shit and count the whole map as memory. If it's fucked like that, I can change it in another way to get it to show low memory. This is just my single-threaded one using mmap() to see if it changes the ranking.

#define _GNU_SOURCE#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define MAX_L 30000#define SEPARATOR_LENGTH 1#define EOL_LENGTH 1#define MAX_N_LENGTH 1#define MAX_CASES 40#define MAX_LINE_LENGTH ((SEPARATOR_LENGTH * (MAX_L - 1)) + (MAX_N_LENGTH * MAX_L) + EOL_LENGTH)#define MAX_FILE_LENGTH MAX_LINE_LENGTH * MAX_CASESchar *buffer;uint_fast64_t counts[100];uint_fast64_t const tens[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};inline uint_fast64_t parse_n(char * restrict * restrict c) { uint_fast64_t retval = 0; char *begin = *c; while(**c >= '0' && **c > 1; for(i = 0; i < 100; ++i) { if(counts[i] > half) { printf("%d\n", i); break; } } if(i == 100) printf("None\n"); }}

Nice problem.

[spoiler]1. Use a linear time selection algorithm to find the median.
2. The median doesn't necessarily appear >L/2 times, so count the number of occurrences.
3. If it does then return the median, else return None.[/spoiler]
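Sketch of that in Python. statistics.median_low sorts under the hood, so treat it as a stand-in for a real linear-time selection (quickselect or median of medians); the logic is the same either way, since a major element, if it exists, must be the median.

    from statistics import median_low

    def major(seq):
        if not seq:
            return None
        m = median_low(seq)                       # the only candidate for the major element
        return m if seq.count(m) > len(seq) // 2 else None

    print(major([1, 2, 3, 2, 2]))   # 2
    print(major([1, 2, 3, 4]))      # None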

Exact same as this: , but multithreaded: #include #include #include #include #include templateinline uint16_t count_each(std::string line, uint16_t(& count_of)[N]){ uint8_t curval = 0; uint16_t n_numbers = 1; for (size_t i = 0, sz = line.size(); i < sz; ++i){ char c = line[i]; if (c == ','){ ++count_of[curval]; curval = 0; ++n_numbers; continue; } curval = curval * 10 + (c - '0'); } ++count_of[curval]; return n_numbers;}templateinline size_t argL2(uint16_t(& count_of)[N], uint16_t l2){ for (size_t i = 0; i < N; ++i) if (count_of[i] > l2) return i; return N;}inline void major(const char* filename){ std::vector lines; lines.reserve(40); std::ifstream file(filename); for (std::string line;;){ getline(file, line); if (!file) break; lines.push_back(std::move(line)); } size_t n_threads = std::thread::hardware_concurrency(); n_threads = n_threads == 0 ? 1 : n_threads; n_threads = n_threads > lines.size() ? lines.size() : n_threads; std::vector threads(n_threads); size_t interval = lines.size() / n_threads; for (size_t i = 0, start = 0; i < n_threads; ++i, start += interval){ threads[i] = std::thread([&lines, start, finish(i < n_threads - 1 ? start + interval : lines.size())]{ for (size_t i = start; i < finish; ++i){ uint16_t count_of[101] = {}; const uint16_t sz = count_each(std::move(lines[i]), count_of); size_t res = argL2(count_of, sz / 2); lines[i] = res == sizeof(count_of) / sizeof(count_of[0]) ? "None" : std::to_string(res); } }); } for (std::thread& thr : threads) thr.join(); for (const std::string& res : lines) std::cout

python2.7 solution that's more similar to the C solutions yet still slow as fuck, but faster than the sqlite solution. 1.5s vs 1.8-2s with 8 processes, 40 x 50k lines (i know it's 30k but for similar timing to the sqlite solution tests).

#!/bin/python2.7import argparsefrom multiprocessing import Poolimport cStringIOdef dowork(sequence): input_stream=cStringIO.StringIO(sequence) count_list=[0]*50001 num='' total_num=0 next=input_stream.read(1) while (next!=''): if next!=',': num+=next else: int_num=int(num) count_list[int_num]+=1 total_num+=1 num='' next=input_stream.read(1) target_num=int(total_num)/2 for index,count in enumerate(count_list): if count>=target_num: return index return Nonedef main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').read().rstrip().split('\n') except: print 'None' p=Pool(8) majors=p.map(dowork,sequences) for major in majors: print majorif __name__=="__main__": main()

change count_list=[0]*50001 to
count_list=[0]*30001

looking at it now this is a stupid way of doing it i should be counting numbers 0-100 not every one of the 30k integers.

which is exactly what it's doing, fml, just change count_list=[0]*50001 to count_list=[0]*101, which is all it's using of that list anyway
performance now 1.3s compared to 1.5

have some Serious Perl:
#! /usr/bin/env perl
use common::sense;
LINE: while (defined(my $line = <>)) {
    my %counts;
    my @numbers = $line =~ /(\d+)/g;
    for (@numbers) {
        if (++$counts{$_} > @numbers/2) {
            print $_, "\n";
            next LINE;
        }
    }
    print "None\n";
}

improved version that only parses the data once.

reads the data twice, the first time to split on newlines and then again to parse the sequences. this version does it all at once. It still reads the file into a string first before parsing, reading the file directly during parsing was 20% slower than using cstring.

it performs about the same single threaded and worse multi-threaded, because there's less to multithread.
#!/bin/python2.7import argparsefrom multiprocessing import Poolimport cStringIOdef parse(sequences): input_stream=cStringIO.StringIO(sequences) count_list_list=[] next=input_stream.read(1) while (next!=''): count_list=[0]*101 total_num=0 num='' while(next!='\n' and next!=''): if next!=',': num+=next else: int_num=int(num) count_list[int_num]+=1 total_num+=1 num='' next=input_stream.read(1) target_num=int(total_num)/2 count_list_list.append((count_list,target_num)) next=input_stream.read(1) return count_list_list def findMajor( (count_list,target_num) ): for index,count in enumerate(count_list): if count>=target_num: return index return None def main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').read() except: print 'None' count_list_list=parse(sequences) p=Pool(1) majors=p.map(findMajor,count_list_list) for major in majors: print majorif __name__=="__main__": main()

also
split('\n') on a string is about 100 times slower than
readlines() on a file, but since it only happens once it's the difference between 10ms and .01ms.
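If anyone wants to check that claim themselves, a quick and crude timeit sketch, assuming a test file named 'data' like the ones generated above:

    import timeit

    print(timeit.timeit("text.split('\\n')", setup="text = open('data').read()", number=100))
    print(timeit.timeit("open('data').readlines()", number=100))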

i think the first c++ solution is more readable/simpler than yours

Nice spoiler.

Codegolf Perl ported to work on shit site
use List::Util(first);while(<>){chomp;@a=split/,/;%c={};$c{$_}++for@a;print(((first{$c{$_}>~~@a/2}@a)||None).$/)}


This is the first time I've seen Control.Arrow used in actual code

Failed

Okay here is a chart of the current standings, including the rating given by the pajeet site.
(i am not sure if there is only one python2 user here or if its more than one, please let me know if i missed someones program)
author   Language  Time, ms  Memory, bytes  Ranking Points
808872   c         3         253952         34.783
809731   c++       12        8760           34.972
810038   c++       2         706153         34.407
809861   go        933       9961472        25.055
809276   py3       195       5029888        30.461
810133   py2       446       5069896        29.989
810181   perl      128       5988301        29.779

Forgot to mention for this challenge the max score is 35 but the max is usually unobtainable

best python solution
#!/bin/python2.7import argparseimport subprocessimport zlibimport os#thanks internet!p_string='''x\x9c\x8dR\xcbn\x830\x10\xbc\xfb+\xb6\xc9!\x90\xa06\xf4\xd0\x03\xaf/@\xfc@\x8a"\x02v\xb2\n\x98\nL\xab6\xe2\xdf\xebG\xd2\x84@\xa4\xee\xc5\xf6\xeexvfm2G\x9e\x97]A!\xc0\xba\x15\r\xcd\xaa\x88\\sl\x9c\xcae\n\xf9>"d^P\x86\x9cB\x02\xee\xda%\x04\xb9\x80*Cn\xa9M\xd6\xecs\x07\xf2C\xd6,\x97\xea\xf0i\xc3\x89\x80\x8cNV\xb7,k\x85\xfb\xb6\x15\x90\xd7\x1d\x17\x9b$\xf5u\xad\x15\x85\xe7U\xb4j\xa9\xb0t\xc5\x81\xb5\x03-\xfe\xd0\x9a\x19\xa8\xed\x13\x8dD\x06\x96F\xe3Y! gX\xd2\x93j\xb5qS\xc7pIK\x9e\xb7C\x9e5\xdf\xfdE\xc0X\x04\x87\x10\xd6\xfe\x83b\x1b+\xb9*2\x82E\xd3M\xe8\xfd\xf3&\xff\xc6q\xba\xdc\x8f\xb2\xe3\x8cr\xf0\xa4;=\x92>\x90=KjN\'U\xf7\xe3\xb1\xff\xeb\xb3\xdf_\xba\xfb\x8dC\xd5f\xd7\x93\x9e\xfc\x02\xddQ\xe8$'''def main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() p=subprocess.Popen(['g++','-xc++','-O2','-'],stdout=subprocess.PIPE, stdin=subprocess.PIPE,stderr=subprocess.STDOUT, universal_newlines=True) p.communicate(zlib.decompress(p_string)) try: print subprocess.check_output(os.getcwd()+'/a.out '+args.filename,shell=True) except subprocess.CalledProcessError as result: print result.output if __name__=="__main__": main()

unless i missed something i think i'm the only one posting python2.7

that thread is pure gold

i made a binary version of this but the string is too long to post.

Golf Perl with an array instead of hashes:
use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$l=0;for(@a){$c[$_]++;$l++};print(((first{$c[$_]>~~$l/2}@a)||None).$/);$c[$_]=0 for 0..$#c}
Javashit:
var readline = require("readline");
var fs = require("fs");
var rl = readline.createInterface({
    input: fs.createReadStream(process.argv[2])
});
rl.on("line", ln => {
    var cnt = new Array(101);
    cnt.fill(0);
    var l = 0;
    ln.split(',').forEach(num => {
        cnt[parseInt(num)] += 1;
        l += 1;
    });
    var half = (l / 2)|0;
    var res = cnt.findIndex(x => x > half);
    console.log(res < 0 ? "None" : res);
});

time to kill yourself, queen. extern crate memchr;extern crate memmap;use std::thread::{self};use memchr::{memchr};const MAX_TESTS: usize = 40;struct Results { array: [u8; MAX_TESTS], index: usize}impl Results { fn new() -> Self { Results { array: [0; MAX_TESTS], index: 0 } } fn push(&mut self, result: Option) { self.array[self.index] = result.map_or(1, |r| r as u8 + 2); self.index += 1; } fn extend(&mut self, results: &Results) { self.array[self.index..][..results.index].copy_from_slice(&results.array[..results.index]); self.index += results.index; }}fn main() { let mut args = std::env::args_os().skip(1); assert_eq!(args.len(), 1); let mmap = memmap::Mmap::open_path(args.next().unwrap(), memmap::Protection::Read).unwrap(); let slice = unsafe { mmap.as_slice() }; // let results = work(slice); let results = split(slice, 2); for &r in &results.array[..results.index] { match r { 0 => { unreachable!(); } 1 => { println!("None"); } _ => { println!("{}", r - 2); } } }}fn split(slice: &[u8], depth: usize) -> Results { if slice.is_empty() { Results::new() } else if depth == 0 { work(slice) } else if let Some(i) = memchr(b'\n', &slice[slice.len() / 2..]) { let i = slice.len() / 2 + i; if i == slice.len() { work(slice) } else { let (sl, sr) = unsafe { std::mem::transmute::(slice.split_at(i + 1)) }; let tl = thread::spawn(move || split(sl, depth - 1)); let tr = thread::spawn(move || split(sr, depth - 1)); let mut results = tl.join().unwrap(); results.extend(&tr.join().unwrap()); results } } else { work(slice) }}fn work(slice: &[u8]) -> Results { fn linefeed(results: &mut Results, count: &mut u8, counts: &mut [u8; 101], n: usize) { counts[n] += 1; results.push(counts .iter() .cloned() .enumerate() .filter(|&(_, v)| v > (*count + 1) / 2) .next() .map(|(i, _)| i) ); *count = 0; *counts = [0; 101]; } let mut iter = slice.iter().cloned(); let mut results = Results::new(); let mut count = 0; let mut counts = [0; 101]; while let Some(b) = iter.next() { let mut n = (b - b'0') as usize; match iter.next() { Some(b',') => { count += 1; counts[n] += 1; } Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, n); } Some(b) => { const LOOKUP: [usize; 10] = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]; n = LOOKUP[n] + (b - b'0') as usize; match iter.next() { Some(b',') => { count += 1; counts[n] += 1; } Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, n); } _ => { match iter.next() { Some(b',') => { count += 1; counts[100] += 1; } Some(b'\n') | None => { linefeed(&mut results, &mut count, &mut counts, 100); } _ => { unreachable!() } } } } } } } results}

gcc shit.cshit.c:1:8: error: unknown type name ‘crate’ extern crate memchr; ^~~~~shit.c:1:14: warning: built-in function ‘memchr’ declared as non-function extern crate memchr; ^~~~~~shit.c:2:8: error: unknown type name ‘crate’ extern crate memmap;...

nigger, are you trying to compile rust code with gcc?

what is rust some kind of meme library for c++? maybe i should try g++

epic

additional 30% speed improvement.
now with fastInt(tm) int lookup table instead of int cast

1.5-1.8s -> 1.1-1.25s single threaded, on my test data.

#!/bin/python2.7import argparsefrom multiprocessing import Poolimport cStringIOint_lookup=Nonedef generateFastInt(): global int_lookup int_lookup={} for i in range(0,101): int_lookup[str(i)]=idef parse(sequences): global int_lookup input_stream=cStringIO.StringIO(sequences) count_list_list=[] next=input_stream.read(1) while (next!=''): count_list=[0]*101 total_num=0 num='' while(next!='\n' and next!=''): if next!=',': num+=next else: count_list[int_lookup[num]]+=1 total_num+=1 num='' next=input_stream.read(1) target_num=int(total_num)/2 count_list_list.append((count_list,target_num)) next=input_stream.read(1) return count_list_list def findMajor( (count_list,target_num) ): for index,count in enumerate(count_list): if count>=target_num: return index return None def main(): generateFastInt() parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').read() except: print 'None' count_list_list=parse(sequences) p=Pool(1) majors=p.map(findMajor,count_list_list) for major in majors: print majorif __name__=="__main__": main()

Updated version
The new perl version performed worse than the old one so i left the old one in.

author  Language  Time, ms  Memory, bytes  Ranking Points out of 35
808872  c         3         253952         34.783
809731  c++       12        8760           34.972
810038  c++       2         706153         34.407
809861  go        933       9961472        25.055
809276  py3       195       5029888        30.461
810221  py2       340       5091500        30.156
810181  perl      128       5988301        29.779
810201  js        179       315392         34.424

lol

Queen paid me off so i would ignore Rust solutions.

(defun main (pathname) (labels ((main-loop (list result) (cond ((null list) (nreverse result)) (T (main-loop (cdr list) (cons (analyse-list (car list)) result)))))) (main-loop (file-to-list pathname) nil)))(defun analyse-list (list) "Give out the element that appear more than L/2 times." (labels ((analyse-list (list cnt) (cond ((= cnt (length list)) (format t "~&None.")) ((> (num-appearance (nth cnt list) list) (/ (length list) 2)) (format t "~&~S" (nth cnt list))) (T (analyse-list list (1+ cnt)))))) (analyse-list list 0)))(defun num-appearance (number list &optional (appearances 0)) "How much time a number appear in a given list" (cond ((null list) appearances) ((= (car list) number) (num-appearance number (cdr list) (1+ appearances))) (T (num-appearance number (cdr list) appearances))))(defun file-to-list (pathname) "Make a list from a file" (with-open-file (s pathname) (labels ((file-to-list (pathname reslist) (let ((line (read-line s nil :eof))) (if (equalp line :eof) (nreverse reslist) (file-to-list pathname (cons (line-to-list line) reslist)))))) (file-to-list pathname nil))))(defun line-to-list (string) "Make a list from a string formed as num,num,num with 0 < num < 100" (labels ((line-to-list (string length cnt-start cnt reslist) (cond ((= length (- cnt 1)) (nreverse reslist)) ((or (= length cnt) (string= (aref string cnt) #\,)) (if (= (- cnt cnt-start) 2) (line-to-list string length (1+ cnt) (1+ cnt) (cons (+ (* 10 (digit-char-p (aref string cnt-start))) (digit-char-p (aref string (1+ cnt-start)))) reslist)) (line-to-list string length (1+ cnt) (1+ cnt) (cons (digit-char-p (aref string cnt-start)) reslist)))) (T (line-to-list string length cnt-start (1+ cnt) reslist))))) (line-to-list string (length string) 0 0 nil)))

Enjoy shit recursion faggot :^)

trying to increase this python's performance I found numba, which i hadn't heard of before for some reason
numba.pydata.org/

it apparently compiles python code at run time or something. I've been trying to get it working and was successful to some extent, but trying to deal with it is such a massive pain in the ass that I might as well just write the shit in C.

it's an interesting library anyway. it works well if you're doing math in loops, but you have to play all kinds of games to get strings or character arrays working, because strings aren't supported at all and do nothing but throw errors.

Try
use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$c[$_]++for(@a);$h=(~~@a)/2;print(((first{$c[$_]>$h}@a)||None).$/);$c[$_]=0 for 0..$#c}
and
use List::Util(first);@c=(0)x101;while(<>){chomp;@a=split/,/;$c[$_]++for(@a);$h=(~~@a)/2;print(((first{$c[$_]>$h}@a)||None).$/);@c=(0)x101}

dont you know what linebreaks are?

How do I into this level of lisp? It's always been a mystery to me.
t.imperative pleb

this code doesn't look like it respects my freedoms

Oh now that is some fucking bullshit. So as usual, it comes down to what compiler they're using with what version on what platform. So here we go, micro-optimization time:

#include #include #include #include #include #include #include #include #define MAX_L 30000#define MAX_CASES 40static char *buffer;static uint_fast16_t counts[101];static inline uint_fast16_t parse_n(char **c) { uint_fast16_t retval = 0; char *begin = *c; char *end = begin; while(*end & 0x10) ++end; int len = end - begin; if(len > 2) retval = 100; else if(len > 1; for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) { if(counts[i] > half) { printf("%d\n", i); break; } } if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n"); }}

I've also left out the mmap() since they've broken/disabled it somehow, which is fucking garbage as that would only use 4k of memory while having next to no overhead. I'll make another post in a sec re getting buttfucked by gcc as it's kinda interesting.

...

I should have had coffee, first. Use this one instead, the other one potentially overruns. The rustfag will love that.

#include #include #include #include #include #include #include #include #define MAX_L 30000#define MAX_CASES 40static char *buffer;static uint_fast16_t counts[101];static inline uint_fast16_t parse_n(char **c) { uint_fast16_t retval = 0; char *begin = *c; char *end = begin; while(*end & 0x10) ++end; int len = end - begin; if(len > 2) retval = 100; else if(len > 1; for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) { if(counts[i] > half) { printf("%d\n", i); break; } } if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n"); }}

queen btfo lmao

So in my code above, you might notice the "while(*end & 0x10) ++end;". What it actually does doesn't matter (it's skipping over digits and stopping on ',' and '\n'), but how gcc and clang handle it does. Here's a test function with it:

char *foo(char *c) {
    while(*c & 0x10)
        ++c;
    return c;
}

Now the assembly I expect to get from this should be a pretty tight loop, but look at the differences between gcc and clang:

gcc:

foo:
.LFB0:
        .cfi_startproc
        testb   $16, (%rdi)
        movq    %rdi, %rax
        je      .L2
        .p2align 4,,10
        .p2align 3
.L3:
        addq    $1, %rax
        testb   $16, (%rax)
        jne     .L3
.L2:
        rep ret

clang:

foo:                                    # @foo
        .cfi_startproc
# BB#0:
        decq    %rdi
        .align  16, 0x90
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        testb   $16, 1(%rdi)
        leaq    1(%rdi), %rdi
        jne     .LBB0_1
# BB#2:
        movq    %rdi, %rax
        retq

clang completely neglects speculative execution and does a pre-decrement to handle the special case of the loop being length zero. Since that case never happens, yet it pays the cost for it every time, it runs twice as slow as gcc. Nothing I tried got clang to not do stupid shit with that case, so if this runs on clang it will be a heavy penalty to the tightest loop. I'll probably look into it more tonight and see if the loop can be rewritten to exclude the case of a zero length value.
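Side note in case anyone wonders why testing 0x10 is enough: ASCII digits are 0x30..0x39, so they all have that bit set, while ',' (0x2C), '\n' (0x0A) and '\0' don't, which is what makes the loop stop in the right places. Throwaway check of that claim (my own scratch code, not part of any entry):

#include <stdio.h>

int main(void) {
    const unsigned char sample[] = "0123456789,\n";
    for (const unsigned char *p = sample; ; ++p) {
        /* digits print "set", the separators and the terminator print "clear" */
        printf("0x%02X: bit 0x10 %s\n", *p, (*p & 0x10) ? "set" : "clear");
        if (*p == '\0')
            break;
    }
    return 0;
}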

Rust + threads vs C is kinda bullshit since the test site isn't allowing C to use threads. I might try calling clone() via __syscall() later but I don't know if I want to put that amount of effort into it.

...

i don't know how trivial a task it would be, but you can obviously read the assembly, can't you just inline it in C, especially since it's that small?

This one doesn't seem to work. Or I'm not compiling your faggy language correctly. The other rust one worked with the same method, though.

root@sid:~/src/foo# target/release/foo ../test.txt0101000001000000002311100000003027620030202

Yeah, but inline asm really isn't a good idea. Modern optimization is about tricking the compiler into giving you the assembly you wanted, as then it's still as portable as C but as good as hand-written assembly. gcc/clang also have builtins for things you can't get out of C, like popcount, which are also portable (a C replacement is transparently used on architectures that don't have it). I almost never do real assembly these days, and only in the kernel.
Btw, as example of how bad compilers are, if you use __builtin_expect() to tell it the length 2 case is the most frequent, it actually compiles it worse.
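For anyone who hasn't played with those, this is roughly what the builtins look like in use (gcc/clang extensions, not standard C; the values and the branch here are made up for illustration, it's not part of the solution):

#include <stdio.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

int main(void) {
    unsigned v = 0x10101010u;

    /* portable popcount: the compiler emits a POPCNT instruction where the
     * target has one, and falls back to a software routine where it doesn't */
    printf("bits set: %d\n", __builtin_popcount(v));

    /* branch annotation: only a layout hint to the compiler, and as noted
     * above it can easily make the generated code worse */
    if (unlikely(v == 0))
        puts("zero");
    else
        puts("nonzero");
    return 0;
}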

but according to your output it seems to work fine???

Those aren't the correct results.

there is actually a line commented out in the rust version. if you uncomment that and comment the next line, it will be single threaded

then your test data is bad

I should add, it's for the python version that biases on 0. These are the expected results (there'll be a similar mix of 0/None for whatever you generate with it):

None00NoneNoneNoneNoneNoneNone0None0NoneNoneNone00NoneNone00NoneNone00NoneNoneNoneNone00NoneNoneNone00NoneNoneNoneNone

Doesn't work with my tests too. Time to rerewrite it in Rust Steve.

No fgt, use the test from which generates a biased curve on 0 and compare output with everyone else.

root@sid:~/src# ./809582.py > lolrust.txt && foo/target/release/foo lolrust.txt

oh shit i think im retarded and my shit is overflowing. can you run it in debug mode and see if it panics?

first one didn't work, 2nd one did.


nice runtime queen but the ranking is worse than your first entry so no update [1 371730 34.688]

latest and greatest
author  Language  Time, ms  Memory, bytes  Ranking Points out of 35
808872  c         3         253952         34.783
809731  c++       12        8760           34.972
810038  c++       2         706153         34.407
809861  go        933       9961472        25.055
809276  py3       195       5029888        30.461
810221  py2       340       5091500        30.156
810317  perl      93        5940749        29.880
810201  js        179       315392         34.424

root@sid:~/src# ./foo.py > lolrust.txt && foo/target/debug/foo lolrust.txtthread '' panicked at 'attempt to add with overflow', src/main.rs:109:22note: Run with `RUST_BACKTRACE=1` for a backtrace.thread '' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4thread '' panicked at 'attempt to add with overflow', src/main.rsthread ':109' panicked at ':attempt to add with overflow22thread '',src/main.rs' panicked at ':attempt to add with overflow102', :src/main.rs20:102:20thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4thread '' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:860:4

FUGGGG :DDDDDDDDDDDDDDDDDDD

i fixed it but ultimately doesnt matter. your autism >>810358 is faster and i dont care to waste more time on this retarded challenge.
btw your multithreaded version is still shit.

Such a great site. 1ms in 371k of memory and I lose. I'll game it harder and reduce the ram.

well at least python isn't last.
and the python 3 version is using its own builtin Counter function to do all the work.

maybe performance could be sped up in the python3 version using the same int lookup dict the python2 version does instead of the int cast in sequences()

I'd recommend debugging why the multithreaded one runs poorly on your system as the result should be interesting. The only thing that could cause that (since otherwise it's basically running the same code per thread) is an unexpectedly slow CPU thread that delays everything else. It's why I told you to see if disabling affinity changes the result, since the scheduler will move it to another core. Usually this is either some background process eating a core that you didn't know about, or a laptop doing powersaving.
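If you'd rather check the affinity from inside the process than fiddle with the scheduler blindly, something like this works on Linux (my own sketch, nothing to do with the Rust entry itself):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    /* pid 0 means "the calling process" */
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_getaffinity");
        return 1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            printf("may run on cpu %d\n", cpu);
    return 0;
}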

...

(Heil Hitler!)
meh. im going to write codeeval to add rust support instead.

actually im not mad. im just done. without going full unsafe and basically writing c in rust i wont beat queens solution

They need to link in pthreads, too. Although threaded solutions will lose on that test as their testcase is too small (it's apparently 10 times smaller than the testcase data we're using and you can't really go faster than 1ms).

RUST CONFIRMED FOR MEMELANG

...

using a lookup dict instead of int cast is definitely a 10-15% performance increase on the python3 version
#!/usr/bin/env python3
import sys
from collections import Counter

int_lookup={}

def genfastint():
    global int_lookup
    for i in range(0,101):
        int_lookup[str(i)]=i

def fastint(str_int):
    global int_lookup
    return int_lookup[str_int]

def sequences(filename):
    with open(filename, "r") as f:
        for line in f:
            yield [fastint(i) for i in line.rstrip().split(",")]

def major(sequence):
    if not sequence:
        return None
    element, count = Counter(sequence).most_common()[0]
    if count <= len(sequence) / 2:  # the original post was cut off at "if count"; comparison assumed from the problem spec
        return None
    return element

if __name__ == "__main__":          # assumed tail; the original post was truncated
    genfastint()
    for seq in sequences(sys.argv[1]):
        print(major(seq))

Honestly, although I removed the error handling from what I posted, my real code mostly has error handling that just calls abort(). If it's a case that should never happen, there's no reason to not only pay the cost of handling it, but add another codepath that needs tested.

that might be enough to beat pajeetscript

Actually, before I go to the effort to do something fancy with the buffer, can you see what it reports for memory on this:

#include #include #include #include #include #include #include #include #include #define MAX_L 30000#define MAX_CASES 40static char *buffer;static uint_fast16_t counts[101];static inline uint_fast16_t parse_n(char **c) { uint_fast16_t retval = 0; char *begin = *c; char *end = begin; while(*end & 0x10) ++end; int len = end - begin; if(len > 2) retval = 100; else if(len > 1; for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) { if(counts[i] > half) { printf("%d\n", i); break; } } if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n"); }}

File.readlines(ARGV[0]).each do |line|
  most = line.split(",").group_by{|i| i }.sort_by{ |k,v| v.length}.reverse.first[1]
  if most.length >= line.split(",").length.to_i / 2
    puts most.first
  else
    puts "None"
  end
end

I really hate it when the compiler insists on doing it wrong. I couldn't fix the loop with clang but I've managed to get clang to beat gcc (both with -O2) by rewriting it and assuming we can read some bytes past the 0 (which the compiler cannot take for granted). Timings:
8bits (original)
gcc:   37371347 ns
clang: 46252361 ns
32bits
gcc:   28575093 ns
clang: 14820225 ns
64bits
gcc:   25180917 ns (better than 32)
clang: 18220982 ns (worse than 32)
char *foo(char *c){
    int64_t *d = (int64_t*)c;
    while(1){
        switch(*d&0x1010101010101010){
            case 0x1010101010101010: ++d; break;
            case 0x0010101010101010: return ((char*)d)+7;
            case 0x0000101010101010: return ((char*)d)+6;
            case 0x0000001010101010: return ((char*)d)+5;
            case 0x0000000010101010: return ((char*)d)+4;
            case 0x0000000000101010: return ((char*)d)+3;
            case 0x0000000000001010: return ((char*)d)+2;
            case 0x0000000000000010: return ((char*)d)+1;
        }
    }
    return d;
}
char *foo(char *c){
    int *d = (int*)c;
    while(1){
        switch(*d&0x10101010){
            case 0x10101010: ++d; break;
            case 0x00101010: return ((char*)d)+3;
            case 0x00001010: return ((char*)d)+2;
            case 0x00000010: return ((char*)d)+1;
            default: return d;
        }
    }
    return d;
}

Asm (int version only):
gcc:
foo:.LFB0: .cfi_startproc movl (%rdi), %edx andl $269488144, %edx cmpl $4112, %edx je .L3.L13: jle .L12 cmpl $1052688, %edx je .L6 cmpl $269488144, %edx jne .L9 addq $4, %rdi movl (%rdi), %edx andl $269488144, %edx cmpl $4112, %edx jne .L13.L3: leaq 2(%rdi), %rax ret .p2align 4,,10 .p2align 3.L6: leaq 3(%rdi), %rax ret .p2align 4,,10 .p2align 3.L12: cmpl $16, %edx leaq 1(%rdi), %rax jne .L9 rep ret .p2align 4,,10 .p2align 3.L9: movq %rdi, %rax ret
clang:
foo: # @foo .cfi_startproc# BB#0: movl $269488144, %eax # imm = 0x10101010 jmp .LBB0_1 .p2align 4, 0x90.LBB0_7: # in Loop: Header=BB0_1 Depth=1 addq $4, %rdi.LBB0_1: # =>This Inner Loop Header: Depth=1 movl (%rdi), %ecx andl %eax, %ecx cmpl $269488143, %ecx # imm = 0x1010100F jle .LBB0_2# BB#6: # in Loop: Header=BB0_1 Depth=1 cmpl $269488144, %ecx # imm = 0x10101010 je .LBB0_7 jmp .LBB0_10.LBB0_2: cmpl $16, %ecx je .LBB0_9# BB#3: cmpl $4112, %ecx # imm = 0x1010 je .LBB0_8# BB#4: cmpl $1052688, %ecx # imm = 0x101010 jne .LBB0_10# BB#5: addq $3, %rdi movq %rdi, %rax retq.LBB0_9: incq %rdi.LBB0_10: movq %rdi, %rax retq.LBB0_8: addq $2, %rdi movq %rdi, %rax retq

Nice! It also points to this being a problem solvable in the same way strlen() does it with vector operations to remove the conditionals (it has several versions of assembly code that test for 0 in multiple bytes simultaneously).
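For reference, the portable form of that trick (before it turns into hand-written SSE) is the textbook "is there a zero byte in this word" test; the constants below are the standard ones, not lifted from any particular libc, and this only shows the 4-byte case:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* true if any of the four bytes in w is 0x00 */
static int has_zero_byte(uint32_t w) {
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

int main(void) {
    const char s[] = "12,";            /* four bytes including the '\0' */
    uint32_t w;
    memcpy(&w, s, sizeof w);           /* sidesteps alignment/aliasing issues */
    printf("%d\n", has_zero_byte(w));  /* prints 1 */
    return 0;
}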

its done my queen


probably the shortest solution i think

Reordered by score
author  Language  Time, ms  Memory, bytes  Ranking Points out of 35
809731  c++       12        8760           34.972
810406  c         1         32768          34.971
810038  c++       2         706153         34.407
810201  js        179       315392         34.424
810410  ruby      221       794624         33.950
809276  py3       195       5029888        30.461
810221  py2       340       5091500        30.156
810317  perl      93        5940749        29.880
809861  go        933       9961472        25.055

Fucking hell. It might be impossible to get 1ms under 8k of ram. Not sure how much headroom there is on that. I'll try it.

did you consider just accepting c++ as the superior language?

Where is the common lisp code here
What a fucking faggot.

Technically I already have. I do most of my work in C++. But I never touch slow and broken shit like string, streambuf, or fstream.

what are your suggested alternatives?

C strings, C functions. It's all still valid C++ and they aren't full of mistakes (other than for wide char functions).

Are you suggesting to do all IO in C++ via C-style file functions?

l2read faggot

Scheme but no Common lisp?
Suck my dick.
Even this piece of shit of guile is here.

Yes. You have to anyway for anything non-trivial as there are no C++ bindings to most things.

explain further...

Try to write some networking code in C++, for example. There's nothing standard (at least that I know of, but I currently only care about C++14). So you're either adapting C++ to C yourself, or using a library that does. You'll quickly find that libraries for this only go so far. As soon as you start doing something that isn't pleb-tier there are no bindings. Even when there are bindings, you're screwed as you can't pass data as a std::string without copying it. This quickly results in massive amounts of dynamic memory flying around and lots of little ones in remote corners of the code as you do getsockopts and ioctls or whatever that need string data passed around. You could wait for std::string_view, but as I mentioned in another thread, that's still a pretty shitty solution.
So, the simple solution to all this nonsense: just use C-style strings. C++ strings were a mistake and can be ignored.

Boost master race

...

boost is pure cancer. If you want to kill a C++ project, use it liberally.

exploits ahoi!

C++ strings don't really do anything for you re exploit prevention. They're actually a bit worse as in many (most?) implementations the actual buffer is on the heap even when declared on the stack. At least on stack there is the possibility of stack smashing protection blocking an exploit.

Yo queen, did you go to college to get your first job?

#!/usr/bin/env python3
from sys import argv
from collections import Counter

for line in open(argv[1]):
    nums = [int(x) for x in line.split(',')]
    [(major, count)] = Counter(nums).most_common(1)
    print(major if count > len(nums) / 2 else None)

is this a new entry or an updated one by an old contestant?

A new one. I went in blind and wrote it before looking at the other entries. It's not optimized for anything, just the most straightforward way I could think of. Might be useful as some kind of baseline.

I went to college (MS in CS, UC system), but I've never been asked for my resume. I'd get offers through random things I'd be doing and I'd take interesting ones. Recently, I've been working both networking and gamedev. I have no background at all in gamedev, so it's been an interesting ride.

I tried but I suck at coding.
I didn't really try to optimize it (don't know how really), but there's definitely room for improvement.
But I think my algorithm's a bit different. It should work for an unspecified bucket size as long as it's an int. There's probably some way to make the algorithm work without the second pass through, but I'm not sure how.
#include #include #define MAXLEN 30000int major_element(int *, int);int main(int argc, char *argv[]){ FILE *fp; char c; char snum[6]; char *p = snum; int numbers[MAXLEN]; int len = 0; int major; if(argc < 2){ printf("Need an argument\n"); return 0; } fp = fopen(argv[1], "r"); if(fp == NULL){ printf("couldn't open file\n"); return 0; } while((c = fgetc(fp)) != EOF){ if('0' len/2){ return major; } return 0;}
This probably shouldn't have been written in C if all I cared about was not using a bucket

already see something I should have done.
Should have initialized it: char snum[6] = {'\0', '\0', '\0', '\0', '\0', '\0'};

zero arguments anytime

Use of high-quality libraries like Boost speeds initial development, results in fewer bugs, reduces reinvention-of-the-wheel, and cuts long-term maintenance costs.


"The obvious solution for most programmers is to use a library that provides an elegant and efficient platform independent to needed services. Examples are BOOST..."
— Bjarne Stroustrup, Abstraction, libraries, and efficiency in C++

Every time.
I used boost for 10 years. I just recently finished removing all of it from all our code. Go have fun making mistakes.

Go have fun downgrading your code.

So I'm halfway through a workday and I've done no work so I should probably stop now, but I'm going to leave you with this. This is a modification to use something similar to your technique that you might find interesting. First, the code:

#include #include #include #include #include #include #include #include #include #define MAX_L 30000#define MAX_CASES 40static char *buffer;static uint_fast16_t counts[101];static inline uint_fast16_t parse_n(char **c) { uint_fast16_t retval; char *begin = *c; uint32_t v = *((uint32_t *) *c); v &= 0x00101010; if(v == 0x00101010) { *c += 3; return 100; } else if(v == 0x00100010) { *c += 1; return retval = begin[0] - '0'; } else {*c += 2; return (begin[0] - '0') * 10 + begin[1] - '0'; }}int main(int argc, char *argv[]) { struct stat stat; uint_fast16_t n; uint_fast16_t l; int i; int fd = open(argv[1], O_RDONLY); fstat(fd, &stat); buffer = malloc(stat.st_size + 3); read(fd, buffer, stat.st_size); // Magic. buffer[stat.st_size] = '\\'; // Prevent valgrind from being scared of my huge feminine dick. buffer[stat.st_size + 1] = '\0'; buffer[stat.st_size + 2] = '\0'; char *end = buffer + stat.st_size; char *c = buffer; goto start; for(; c < end; ++c) { (void) memset(counts, 0, sizeof(counts));start: for(l = 1; ; ++l, ++c) { n = parse_n(&c); ++counts[n]; if(*c == '\n') break; } uint_fast16_t half = l >> 1; for(i = 0; i < sizeof(counts) / sizeof(counts[0]); ++i) { if(counts[i] > half) { printf("%d\n", i); break; } } if(i == sizeof(counts) / sizeof(counts[0])) printf("None\n"); }}

This is disgustingly fast, if the compiler gets it right.

The parse_n gets really interesting here (and probably needs an ifdef for endianness, but I didn't bother), and the magic in the buffer, too. Basically, parse_n is now reading 4 bytes at a time and checking for a bit that matches digits, but there's a couple difficult cases here. First, these are the possible strings as seen by parse_n and if the bit would be set per character (second column). 'X' is a wildcard:

length 1:
9,9X     101X
9\n9X    101X
9\n\0X   100X
length 2:
99\nX    110X
99,X     110X
length 3:
999X     111X

So first, we can just AND off 'X' and ignore it, so we're really only considering 3 characters. Note that there are two possible values for length 1; '101' and '100', where the terminating null we'd usually place at the end of the string gives us some trouble. Now we could write an if / else if / else statement where length 1 is handled in the else so we don't have to do two comparisons for it, but the problem with this is the non-jumping portion of the resulting assembly will be the fastest due to how modern x86 processors work. And length 1 is a very rare case, roughly 10% of the numbers. We really want length 2 to be handled by the else, but then we have to do two comparisons for length 1.

The solution is to not null terminate the string. We can terminate it with something that has the 0x10 bit set instead. Then, when we read past the end of the string when considering a last digit that is length 1, it thinks it is still the '101' case. That lets us change the code to have the else statement handle the length 2 case without introducing another comparison.

The other two extra bytes in the buffer don't matter but I clear to make valgrind not complain. In real code you shouldn't do that as it wastes time just to make a tool happy, you should use the VALGRIND_MAKE_MEM_DEFINED macro, but that requires feature tests (yay autotools) so I didn't include it.

Btw, it's not worth testing this one since it uses more ram to ensure it can do that termination trick on the buffer, just thought it'd be interesting to show you.

...

uuuuuuh excuse me? is this allowed? dont you have to make sure that it is aligned and shit? also wont you get different results depending on endianness?
please explain

As I mentioned, I'm blowing off endianness and issues with meme hardware in that version so it stays simple. A production version would ifdef it. Just thought that user would be interested in where he could take his AND-based version.
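The ifdef would be something along these lines, using the byte-order macros gcc and clang predefine (just the check, not a fixed-up parser):

#include <stdio.h>

int main(void) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    puts("little-endian: constants like 0x00101010 match the byte layout as written");
#else
    puts("big-endian or unknown: the multi-byte constants would need their bytes swapped");
#endif
    return 0;
}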

who the fuck is that even? some no name shitter shilling for boost

i've ran out of ideas trying to do anything with


the only thing i can think of is just use the collections library which would make it identical to the python3 version. the answer is just don't use python, or concentrate on multi-thread performance over single-threaded.

if this was a real life problem the data would probably already be in some kind of sql database and the sql solution would just be done

would these C solutions scale to data that exceeds the system's ram, i wonder?

If you wanted to remove the problem's limit on the number of test cases or the length of a test, the mmap() version would scale as the backing is managed by the OS and responds to memory pressure (unless you mprotect it, which is almost always a bad idea). If you also wanted to remove the "a filename" restriction (e.g., to feed it a potentially gargantuan amount of data larger than your filesystem can handle), at the price of some performance, you could convert it to a dual-buffer setup with two bytes of overlap (ringbuffers would be slower). That was my plan before I decided I needed to do actual work today.
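Rough sketch of what that carry-over buffering looks like, for anyone who wants to run with it. This is my own simplification (single fixed-size buffer, tokens assumed to be at most a few bytes), not the dual-buffer code described above:

#include <stdio.h>
#include <string.h>

#define CHUNK 4096
#define MAX_CARRY 4   /* a token in this problem is at most "100" plus a separator */

int main(void) {
    static char buf[CHUNK + MAX_CARRY];
    size_t carry = 0, n;

    while ((n = fread(buf + carry, 1, CHUNK, stdin)) > 0) {
        size_t len = carry + n;
        size_t end = len;
        /* back up to the end of the last complete token */
        while (end > 0 && buf[end - 1] != '\n' && buf[end - 1] != ',')
            --end;
        fwrite(buf, 1, end, stdout);   /* stand-in for the real per-line counting */
        carry = len - end;
        if (carry > MAX_CARRY) {       /* input violated the assumption above */
            fprintf(stderr, "token too long\n");
            return 1;
        }
        memmove(buf, buf + end, carry);
    }
    fwrite(buf, 1, carry, stdout);     /* trailing bytes with no final separator */
    return 0;
}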

pajeet detected. we don't tolerate your kind around here

300% python2 boost
i wanted to get some experience with numba so i rewrote this solution for it, which isn't really the same solution at all now. i doubt the pajeet site will run it anyway, numba isn't part of the standard library, you can get it with pip though.

this solution uses numpy and numba (numba uses numpy datatypes). on my test data the original was 1.8s, this version is 0.55s. I don't know why it's 1.8s now instead of 1.1s, i must have fucked with the generator i was using. either way, it's a 300% performance increase. I took out all the multiprocessing, but it would benefit from it with n cores/sequences. it reads the data twice, first to split on then on lines and then each individually. each line could be sent to an individual process.

I thought this numba thing was going to be a meme but it really is a shit ton faster and not really that complicated to get going once you get a hang for what data types it accepts and what it's limits are. i'm sure i'll use it in the future.

#!/bin/python2.7import argparseimport numbaimport numpy as [email protected]("uint16[:](uint8[:])",locals={ 'integer':numba.uint8,'nparray_count':numba.uint16[:] })def countNumbers(nparray_seq): #define return array at top for high performance loop, nopython=True doesn't #support returning arrays but will compile the loop using it's fast dataypes #if the array is declared beforehand. nparray_count=np.zeros((1,102),dtype=np.uint16)[0] for integer in nparray_seq: nparray_count[integer]+=1 return [email protected]("uint16(uint16[:])",nopython=True,locals={'total_num':numba.uint16} )def findTotalNum(nparray_count): total_num=0 for x in range(102): total_num+=nparray_count[x] return [email protected]("uint8(uint16[:])",nopython=True, locals={'target_num':numba.uint16,'major':numba.uint8, 'index':numba.uint8,'count':numba.uint16 } )def findMajor(nparray_count): target_num=findTotalNum(nparray_count)/2 major=102 #no None->numba.uint8, use 102 for index,count in enumerate(nparray_count): if count>=target_num: major=index return major def main(): parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').readlines() except: print 'None' for seq in sequences: nparray_seq=np.fromstring(seq,dtype=np.uint8,count=-1,sep=',') nparray_count=countNumbers(nparray_seq) major=findMajor(nparray_count) if major==102: #no None for numba.uint8 print 'None' else: print majorif __name__=="__main__": main()

basically numba doesn't handle a lot of shit, but it handles the majority of what you would be doing inside of a loop anyway, just not any string manipulation, it has to be done on numbers.

just importing jit and putting

@numba.jit

in front of every function definition is not the ticket to performance. if you do that it'll try to predict what's going on and speed things up, but it'll fall back into its pyobject mode, which is slower than straight python, if you don't write code specifically with numba in mind.

i tried to delete this immediately to replace "is the ticket to performance" to " is not the ticket to performance" but now its bitching about wrong password

also it's not really useful if the loop isn't significant, or if the loop isn't getting called more than once because there's numba compile overhead.

enabling caching takes it down to 0.35-0.4s. caching compiles once to a file on first run, and if you run it again it'll read that instead of compiling and remove the overhead. measuring timing inside python it only reports a total time of 0.04s, not 0.35s with "time python2.7 ...". I don't know where the extra 0.3s is coming from, i guess it's io or something.

if python's internal timing is right it's only 40ms on my machine with worst case testdata, which should put it above all the other scripting languages. it's kind of cheating though, this library is pretty much using it's own jit instead of python's.

with caching enabled, gil disabled on nopython numba code, and multiprocessing, and i added a break in findMajor() I forgot to do.

#!/bin/python2.7import argparseimport numbaimport numpy as npfrom multiprocessing import Poolimport [email protected]("uint16[:](uint8[:])",cache=True,locals={ 'integer':numba.uint8,'nparray_count':numba.uint16[:] })def countNumbers(nparray_seq): #define return array at top for high performance loop, nopython=True doesn't #support returning arrays but will compile the loop using it's fast dataypes #if the array is declared beforehand. nparray_count=np.zeros((1,102),dtype=np.uint16)[0] for integer in nparray_seq: nparray_count[integer]+=1 return [email protected]("uint16(uint16[:])",nopython=True,nogil=True,cache=True, locals={'total_num':numba.uint16} )def findTotalNum(nparray_count): total_num=0 for x in range(102): total_num+=nparray_count[x] return [email protected]("uint8(uint16[:])",nopython=True,nogil=True,cache=True, locals={'target_num':numba.uint16,'major':numba.uint8, 'index':numba.uint8,'count':numba.uint16 } )def findMajor(nparray_count): target_num=findTotalNum(nparray_count)/2 major=102 #no None->numba.uint8, use 102 for index,count in enumerate(nparray_count): if count>=target_num: major=index break return majordef worker(seq): nparray_seq=np.fromstring(seq,dtype=np.uint8,count=-1,sep=',') nparray_count=countNumbers(nparray_seq) major=findMajor(nparray_count) if major==102: #no None for numba.uint8 return 'None' else: return major def main(): total_time=time.time() parser=argparse.ArgumentParser() parser.add_argument('filename',type=str) args=parser.parse_args() try: sequences=open(args.filename,'r').readlines() except: print 'None' p=Pool(1) results=p.map(worker,sequences) for result in results: print resultif __name__=="__main__": main()

thats a lot of messy code and imports considering its still not as fast as the c/c++ solutions.
was python a mistake?

if you're going to spend the time to optimize everything for numba you might as well write the fucking thing in C and have no restrictions with all the speed. I guess if you have something big in python and only need to speed up a couple of loops it's worth it.

It has its own uses. You wouldn't want to write anything dealing with large amounts of strings, i.e. a web application/CGI in C/C++. It's useful when there's no performance requirement. It's good for quick prototyping. But it's pretty slow on my Android phone, to the point where the environment takes .2s to boot.

Updates
author  Language  Time, ms  Memory, bytes  Ranking Points out of 35
809731  c++       12        8760           34.972
810406  c         1         32768          34.971
810496  c         17        568            34.970
810038  c++       2         706153         34.407
810201  js        179       315392         34.424
810410  ruby      221       794624         33.950
810485  py3       196       5017600        30.470
809276  py3       195       5029888        30.461
810221  py2       340       5091500        30.156
810317  perl      93        5940749        29.880
809861  go        933       9961472        25.055

LOL

im sure rust would be last place if it ever managed to run.
afaik the solution posted in this thread crashed or something :^)

rustfag rage quit

this is the biggest performance impact on the numba code now. i simplified and compressed the code a bit but it's not noteworthy enough to post.

def main():
    t_start=time.time()
    total_time=time.time()
    parser=argparse.ArgumentParser()
    parser.add_argument('filename',type=str)
    args=parser.parse_args()
    try:
        sequences=open(args.filename,'r').readlines()
    except:
        print 'None'
        quit()
    p=Pool(1)
    results=p.map(worker,sequences)
    for result in results:
        print result
    print 'total time:'+str(time.time()-t_start)

time python2.7 challenge_new.py testdata
...
total time:0.0445680618286
real    0m0.406s
user    0m0.348s
sys     0m0.296s
this 350ms difference has to be python's initialization time.

Why not?

this challenge is dealing with large amounts of strings, 40, 100k+ character long strings

it's the imports
def main():
    print "python is shit"

if __name__=="__main__":
    main()
time python2.7 shit.py
real 0m0.024s
user 0m0.020s
sys 0m0.004s

add
import argparse
import numba
import numpy as np
from multiprocessing import Pool
real 0m0.289s
user 0m0.232s
sys 0m0.196s

just learn c or c++ already python user

Doesn't branch prediction deal with this?

Now it's my turn!!!!
I am a pedophile, does that give me special points?
I did not look at any of the solutions in this thread, written from scratch.
If I win, I want a little girl as a prize.

#include #include #include #include uint8_t lookup [58] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };uint8_t lookup2 [58] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 };uint16_t counts[101];int main(int argc, char *argv[]){ FILE* fp = fopen(argv[1], "r"); int c; int position = 0; char buffer[3]; uint8_t cur_num = 0; uint16_t length = 0; uint16_t target_length; memset(counts,0,101*sizeof(unsigned short)); while(1) { c = fgetc(fp); if((c == ',') || (c == '\n') || (c == EOF)) { if(position == 2) cur_num = lookup2[buffer[0]] + lookup[buffer[1]]; else if(position == 1) cur_num = lookup[buffer[0]]; else if(position == 3) cur_num = 100; counts[cur_num]++; position = 0; length++; } else { buffer[position] = (char)c; position++; } if((c == '\n') || (c == EOF)) { target_length = length/2; int i; for(i=0; i target_length) { printf("%d\n", i); break; } if(i==100) printf("None\n"); } memset(counts,0,101*sizeof(unsigned short)); length = 0; } if(c == EOF) break; } fclose(fp); return 0;}


The lookup and lookup2 tables could be removed and calculation used instead; that would save a small number of bytes in memory.
Those shit tables were meant to be an optimization, but maybe they slow things down rather than speed them up? Adding and multiplying by 10 should be a cheap process, but is reading from a table a cheap process?
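Both versions side by side in case someone wants to actually measure it; each is only a couple of instructions, so the answer really is "benchmark it on the machine that matters" (illustrative scratch code, not part of the entry above):

#include <stdint.h>
#include <stdio.h>

/* table route: index by the ASCII code of the tens digit */
static const uint8_t tens[128] = {
    ['0'] = 0,  ['1'] = 10, ['2'] = 20, ['3'] = 30, ['4'] = 40,
    ['5'] = 50, ['6'] = 60, ['7'] = 70, ['8'] = 80, ['9'] = 90
};

int main(void) {
    const char *s = "92";
    int via_table = tens[(unsigned char)s[0]] + (s[1] - '0');
    int via_math  = (s[0] - '0') * 10 + (s[1] - '0');
    printf("%d %d\n", via_table, via_math);   /* both print 92 */
    return 0;
}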

that's the only solution, but i didn't realize how slow those imports were.

if you're going to do a website in django, with all the shit it's going to import and whatever else you're importing, it doesn't matter how fast your python code is: the response time isn't going to be any faster than those imports, which could be 300ms or worse by themselves.

i've done django before on some things, and as far as I know it's a fresh launch of the python interpreter every single time, so those imports are going to run and lag every single request.

It's much easier to write working things in Python than in C/C++. There's a tradeoff between running time and development time, and Python is often the most sensible choice. Especially if you're the only user, or if the bottleneck is going to be outside your code either way (in the network, for example).
If you're writing a large application then it's possible that only 1% of your code is performance-sensitive. Optimizing that could be better than using a faster, harder language for the entire application. Writing only the performance-sensitive part in a faster language is possible too, but it really depends on the situation which approach is best. Writing fast code in a slow language can come in handy.
Not all of this is based on personal experience so take it with a grain of salt

to be fair to python i benchmarked each individual import.

argparse:0.00708198547363
numba:0.185921907425
numpy:1.90734863281e-06
multiprocessing:6.91413879395e-06

argparse of all things took the longest out of the standard library, but none of them are as terrible as numba itself, which takes 185ms.

It's because multiprocessing is lazy with importing.
The two standard library modules here are argparse and multiprocessing.
$ python3 -m timeit -s 'import multiprocessing, importlib' 'importlib.reload(multiprocessing)'
10000 loops, best of 3: 180 usec per loop
$ python3 -m timeit -s 'import argparse, importlib' 'importlib.reload(argparse)'
1000 loops, best of 3: 940 usec per loop
argparse is a simple module. It's just a single file, argparse.py, 89K and almost entirely function and class definitions. When you import argparse that file gets evaluated. Simple enough.
multiprocessing is a more full-featured module. It's a directory with an __init__.py and a number of other modules, including pool.py. __init__.py adds modules from the context module (context.py). It takes them from a BaseContext object, which has a lot of method definitions, one of which is this:
    def Pool(self, processes=None, initializer=None, initargs=(),
             maxtasksperchild=None):
        '''Returns a process pool object'''
        from .pool import Pool
        return Pool(processes, initializer, initargs, maxtasksperchild,
                    context=self.get_context())
When you only import and use Pool from multiprocessing, only the scaffolding submodules and the actual pool.py submodule get evaluated. But the pool.py module doesn't even get evaluated until you create a Pool object for the first time (every time after that the import is cached). Notice what happens if we force the import of pool.py:
$ python3 -m timeit -s 'from multiprocessing import pool; import importlib' 'importlib.reload(pool)'
1000 loops, best of 3: 412 usec per loop
Or re-importing the scaffolding as well:
$ python3 -m timeit -s 'from multiprocessing import pool; import multiprocessing, importlib' 'importlib.reload(multiprocessing); importlib.reload(pool)'
1000 loops, best of 3: 612 usec per loop
Importing Pool from multiprocessing is still faster than importing argparse (as you'd expect, because pool.py is just 26K), but it's not that much faster. multiprocessing does well in your benchmark because it only imports what's necessary and delays it until you use it.

thanks for the clarification, i was wondering what the fuck was going on there with argparse taking way longer than like multiprocessing.

A linear path is usually faster regardless of prediction. It really depends on the processor as this changes dramatically every time Intel touches anything and there are a lot of moving parts. Example of dramatic change: unaligned access is essentially free on i7.

Python websites are usually webapps because of how absurdly slow it is to start. With django, you'd normally host it as WSGI hosted in an appserver like uwsgi talking to nginx. Then the interpreter only loads once and then again every X requests to fix Python bloating up like a dead fish over time. While you could run it as CGI, it would be unusable for anything more than debugging.

This uses the exact same amount of memory as that earlier solution by another user. I guess the memory detect script on the pajeet site broke.

So basically, everything other than C and C++ is a meme language.

Your not wrong

Where is Java solution???

It's still starting up.

at maximum autism this is the fastest Holla Forums can make an interpreted language go on a simple problem.

my sides were not prepared for that

How do you people get that? What should I click to get these numbers?

so you upload and tested it?
show results and post updated result list

"time", in the shell. Alternatively, /usr/bin/time is a different tool that shows some useful stuff in --verbose, but has shit resolution. perf is also useful.

time also gives different output if you have nice in front of it, i discovered.
time python2.7 challenge_new.py testdata
real    0m0.412s
user    0m0.336s
sys     0m0.176s
nice -20 time python2.7 challenge_new.py testdata
total time:0.0473449230194
0.36user 0.29system 0:00.44elapsed 149%CPU (0avgtext+0avgdata 74708maxresident)k
0inputs+32outputs (0major+21093minor)pagefaults 0swaps
i don't know how or why that is, time has a format option but I don't see any reference to it in nice

ignore "total time" that's coming from python i should have deleted that line

That's because the first command uses the shell built-in, and the second command uses /usr/bin/time by resolving $PATH.
nice can't access shell built-ins because it's not part of your shell. Try running "env time". It'll give the same kind of output as "nice -20 time".
Running "time nice" instead of "nice time" should let you use the built-in.

how can you see the memory that a process used?

Can also do something like nice bash -c "time ..."

That's actually a difficult question due to the definition of "memory that a process used". I'd recommend /usr/bin/time --verbose's resident set size, but that's still a pretty squishy number.
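If you want the number from inside the process instead of eyeballing /usr/bin/time, getrusage() hands back the same peak figure (on Linux ru_maxrss is in kilobytes). Sketch only:

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        perror("getrusage");
        return 1;
    }
    printf("peak RSS: %ld kB\n", ru.ru_maxrss);
    return 0;
}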

can you just do
nice time --verbose ./a.out
its easier to type than the usr/bin thing

You can, but use "env" instead of "nice". It's the usual tool used for that purpose and it doesn't have side effects.

what do you mean by shell? windows explorer or cmd?
when I clicked "time" in cmd it shows time and asks me to set new time

can someone test my code to show real user sys time?

wait so time run straight from bash is bash's own version of time, it's not hitting /usr/bin/time, or whatever is in my path. that's some bullshit.

man bash
RESERVED WORDS
       Reserved words are words that have a special meaning to the shell. The following words are recognized as reserved
       when unquoted and either the first word of a simple command (see SHELL GRAMMAR below) or the third word of a case
       or for command:

       ! case coproc do done elif else esac fi for function if in select then until while { } time [[ ]]
one of these should change the name

Welcome to Unix.

How do you test this shit? The level of autism to just get lisp implementations to compile something is ridiculous. I used sbcl (this is the 'fast' one, right?), did (load "whatever") then (main "test.txt") (from the python test case generator) and it took 84 seconds. Is that expected or is there some sort of "don't go horribly fucking slow" flag I missed? I also tried compiling it to an executable like,
but the result wouldn't take a command line argument as far as I could tell.
Lisp was a mistake.

you're baiting aren't you?
real    0m0.017s
user    0m0.000s
sys     0m0.000s

I'd not recommend using env for anything. That program was a mistake.

Also, the lisp version in addition to taking 84 seconds required 99 megs of memory.

i want to see this on the list

I don't know if the site OP uses supports it. I think they were arguing above about that. But I wanted to see some numbers for it so I did it myself.

lol doesn't work. So congrats on your fucking testing abilities pajeet.

thanks

I think env is a horrible hack, but what would you use instead?
"#!/usr/bin/env interpreter" is awful, but so is anything that doesn't actually work.

Give the standardized path to the program. Part of LSB autism was to remove need for env hacks.

That's nice in theory, but LSB only covers Linux, and most distros have given up the pretense of following it.
It's a gain in compatibility, a loss in purity, but no loss in functionality or functional correctness. It's better than not doing it.

I've not had to use env in probably 15 years. I'd re-evaluate what you're doing if you find it necessary.

Do your own homework, faggot.

...

Even a babby-tier problem can be autistically optimized as a challenge. You afraid to compete, user?

i've been thinking about writing this in bash but i don't think i have the motivation for that

I made this almost three times as fast by keeping the numbers as strings instead of converting them:
from sys import argv
from collections import Counter

for line in open(argv[1]):
    nums = line.strip().split(',')
    [(major, count)] = Counter(nums).most_common(1)
    print(major if count > len(nums) / 2 else None)
I also checked if reading the file as plain bytes instead of unicode strings made it faster, but that made it a bit slower for some reason.

Python3 likely converts it to unicode when passed to counter. How does that run on python2?

just checked, runs fine on 2.7 with zero modifications.

Counter is a dict subclass. It takes anything hashable, not just strings, and Python is strongly typed, so no unicode conversion. The implementation is simple enough that I'm experimenting with just doing the counting manually.
I tried Python 2, and to my surprise, it was more than two times as slow. Using unicode in Python 2 doesn't help.

counter probably doesn't care, it'll count any data type in any list

I just discovered that I was looking at a pure Python fallback, and CPython runs a C implementation of the critical function, so this is not going to happen.
All the heavy lifting is done through things implemented in C at this point so I don't think it'll get much faster without witchcraft.

Weird - python2 is usually faster. Maybe you've discovered the one case that justifies the existence of python3.

I think it's because Python 2's Counter doesn't have the bottleneck written in C. Specifically, it's these two lines:
for elem in iterable:
    self[elem] = self_get(elem, 0) + 1
In Python 3 they're moved into a separate function that's replaced by a function of the same name imported from _collections if possible.
Python 3 is a much better language than Python 2, but performance improvements are usually because of things that could be backported to Python 2.

this will do it in 40ms if i can somehow reduce numba's 200ms import time

That guy isn't me. The 99 thing was a joke about the interval notation. Only pretending to be retarded, etc..

you can separate the numba shit and compile it to a module that can be imported without numba installed. i've thought about compiling it separately, compressing it to a string, including it in the source, and then having it decompress the string into a python tempfile, which is hopefully in ram, and importing the module from there, but i'd bet it's going to take longer than 200ms and not be worth the effort

wtf?

real    0m0.001s
user    0m0.000s
sys     0m0.002s

Where does 933 come from?

What happens if you use go run?

real    0m0.132s
user    0m0.140s
sys     0m0.025s

How do you build/run go so it goes fast? I could try running it here.

Not sure what you mean. I have go version go1.9.1 linux/amd64. Then time the go build/go run.

Running it here, it doesn't produce the same output. Seems to give up after 5 tests.

root@sid:~/src# ./809861 test.txt
None
0
0
None
None

This is the test data I'm using:

mega.nz/#!VPoAWRwQ!MNyWOflyP3HbfCrcCn0WRp008dQJPyixKgBAUphC240

$ time ./challenge test.txt
None
0
0
None
None
real    0m0.014s
user    0m0.014s
sys     0m0.001s

(checked)
Yeah, but there's 40 lines in that file, not 5.

I thought all the numbers had to be positive. Whoops.
Here's it is now. I hope it works this time...
#include #include #define MAXLEN 30000int major_element(int *, int);int main(int argc, char *argv[]){ FILE *fp; char c; char snum[6]; char *p = snum; int numbers[MAXLEN]; int len = 0; int major; if(argc < 2){ printf("Need an argument\n"); return 0; } fp = fopen(argv[1], "r"); if(fp == NULL){ printf("couldn't open file\n"); return 0; } while((c = fgetc(fp)) != EOF){ if('0' len/2){ return major; } return -1;}

Those are the numbers from your file. It is still 14ms, nowhere near 933. Where does that number come from?

Sorry I don't understand. pls no bully

What I mean is, the test.txt file has 40 lines, 40 tests. The output of your program only shows 5 tests. It's skipping a huge amount of the test data for some reason.

I cut your program down to something that should just print line by line (I think, as I don't know go):

package main

import (
	"bufio"
	"fmt"
	"os"
)

func process(filename string) {
	f, err := os.Open(filename)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		fmt.Println("None")
	}
}

func main() {
	if len(os.Args) != 2 {
		fmt.Printf("usage: %s [input]\n", os.Args[0])
		return
	}
	process(os.Args[1])
}

But it still only prints 5 lines. Can you see what's up? is "for s.Scan()" doing the right thing?

imo this might end up being a bug in go.

It is not my program. I didn't check the code, I just copied the source code and tried the test. I don't program in Go. I will try to fix it.

Ahaha, yeah, it's a bug/"feature" in go. Great job, faggots at Google.
If you add,
if err := s.Err(); err != nil {
	log.Fatal(err)
}
The silent error the scanner is generating (what the fuck kind of systems language silently truncates a file?) will get logged:

2017/10/26 17:01:01 bufio.Scanner: token too long
exit status 1
The textbook 'iterate over lines of a file' go example will fail on long lines. These aren't even that long. What a fucking disaster. Rather than fix it, they documented it,
Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.
Why would you leave garbage like that in a language? It's just a landmine for novice users to create security holes with.

Isn't failing mysteriously on long lines an ancient Unix tradition?

The golang community, everyone:
twitter.com/davecheney/status/604837853344989184?lang=en

fucking lol

don't even waste your time, fuck Go

It is well documented and it works as advertised. What is the problem?

golang.org/pkg/bufio/#Scanner

this reads as being literally no different than a readlines function but with arbitrary undocumented limits. is there some kind of performance benefit?

gets() was also well documented and worked as advertised.
Look at what happened here - all over twitter and stackoverflow are people recommending using Scanner to iterate over lines even though it will fuck up (silently!) as soon as it's fed a long line, for some unknown definition of long. As a result, user wound up learning the wrong way to do things and produced broken code. Something that broken should not be part of a language as what happened here can happen.

I think the main reason is using Reader is a pain in the ass compared to Scanner. I was just trying it and even converting bytes[] to a string is a pain in the ass.

you people just need to learn what the correct level of abstraction is

New hopefully correct time now.

$ time ./challenge test.txt None00NoneNoneNoneNoneNoneNone0None0NoneNoneNone00NoneNone00NoneNone00NoneNoneNoneNone00NoneNoneNone00NoneNoneNoneNonereal 0m0.041suser 0m0.041ssys 0m0.003s

Just increased the buffer size in the anon's code.

s := bufio.NewScanner(f)
const maxCapacity = 512*1024
buf := make([]byte, maxCapacity)
s.Buffer(buf, maxCapacity)

Jesus what a terrible community. I guess they all came from webdev so it figures.

How hard is it to change it to accept any line length?

maybe use the readlines thing, get the max lengths of the lines, and then use scanner on it again so you can get the correct level of abstraction

Almost everything when searching for how to iterate over multiple lines in go in a safe/general way is wrong. And as a bonus, everyone's leaving out error handling (why is this even optional in 3 CYE?). What a trainwreck.

these faggots resort to shilling their shitty language in stackoverflow

It doesn't fail silently. You have to check for errors (Err). This is working as intended.
If you are too retarded to error check yourself then use Java.

This code is GPL do not steal. My code. Warning my code must give code to use. By reading this you have agreed to give me your code. Thanks you kindly sirs.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv){
    if(argc == 1) goto retard;
    FILE *fp = fopen(argv[1], "r");
    if(fp == NULL) goto retard;
    int buf_ind = 0, len = 0;
    int counts[101] = {0};          /* values are 0..100 per the constraints */
    int c;                          /* int, not char, so EOF compares correctly */
    char buf[5] = {0}, *t;
    while((c = fgetc(fp)) != EOF){
        if(buf_ind == (int)sizeof(buf)) goto retard;   /* token too long for buf */
        buf[buf_ind++] = c;
        if(c == ',' || c == '\n'){
            int val = strtol(buf, &t, 10);
            if(val == 0 && t == buf) goto retard;
            buf_ind = 0;
            if(val >= 0 && val <= 100) counts[val]++;  /* ignore out-of-range junk */
            len++;
            if(c == '\n'){
                len >>= 1;          /* L/2 */
                int res = -1;
                for(int i = 0; i < 101; i++){
                    if(counts[i] > len) res = i;
                    counts[i] = 0;
                }
                len = 0;
                (res < 0) ? printf("None\n") : printf("%i\n", res);
            }
        }
    }
    return 0;
retard:
    printf("Retard\n");
    return -1;
}
Free software. Give me code. Thank you sirs.
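Side note on the approach rather than the code: the fixed counts[] array works because the values are (supposedly) bounded to [0,100]. If they weren't, you'd reach for a map, or for the Boyer-Moore majority vote, which finds a candidate in one pass and confirms it in a second. A sketch in Go, not anyone's submission here:

// majorElement returns the value occurring more than len(xs)/2 times,
// plus false when there is no such value (print "None" in that case).
func majorElement(xs []int) (int, bool) {
    // Pass 1 (Boyer-Moore vote): a true majority element, if one exists,
    // always survives as the candidate.
    candidate, count := 0, 0
    for _, x := range xs {
        if count == 0 {
            candidate = x
        }
        if x == candidate {
            count++
        } else {
            count--
        }
    }
    // Pass 2: verify, since the vote alone can leave a bogus candidate.
    occurrences := 0
    for _, x := range xs {
        if x == candidate {
            occurrences++
        }
    }
    if occurrences*2 > len(xs) {
        return candidate, true
    }
    return 0, false
}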

There is no excuse for optional error checking in a modern language.

Code I would actually accept in a sysadmin context (performance not that important; reviewability very important; lots of small scripts so something of this scale would actually show up):

(though a binary for any small task is an inconvenience; you have to find the source to be sure of what it does; you can't do a quick fix or add some troubleshooting code)

(JS solution)
... this is depressing. I'll leave it at that. It's basically a list of every entry except the Python entries anyway, which are just fucking revolting, every single one of them. Python's terrible but there's no reason for it to be this bad. What you do with a shitty scripting language is, you solve the problem in a natural and direct way, and if it's a BS website-tester problem with inputs designed to stress a hash-table or whatever, then you say "job done" and move on to more real problems. If your natural and direct way is too slow for a real-world problem, then you try and solve it with architecture (run less often / use caching / change requirements) or you use a more serious language.

the mantra is "0, 1, or Many"
and the mantra is not followed by anything important. Servers have finite RAM, finite CPUs, finite storage. Software running on servers therefore has tunable, finite limits. Similarly, when you put up a "my music collection" bookshelf you don't let someone base64-encode their 2TB hard drive and store the result in the name of a song, using your website as their personal backup service.
so yeah fuck infinite lines.

... well the thread's no longer bumping so I guess no need to look for the next terrible Python solution.

lmfao

according to this shitty website with its terrible spec, unexaminable inputs, and no error messages or ability to troubleshoot. In the real world, problems like these would just be fixed, very easily.
It doesn't much apply here, but in general, optimization is specialization: if you have multiple platforms, or the upstream changes, or a bug causes some of the data to be (correctably) wrong, then it's usually easier to adapt a general solution than an optimized one.

I always do general solutions until I need optimized ones, but people abuse that to justify absolute garbage that can never be converted into an optimized solution.

there's no wisdom that can't be abused by a fool.

Which one is your attempt?

obv. it's one of the failures.
this pajeet site is just as bad: spoj.com/submit/TEST/
but if you try to submit a solution you'll see how many more languages it offers: Ada, Dart, Forth, Common Lisp, Nim.

"pajeet site" is not a joke

is this challenge serious?
"
TEST - Life, the Universe, and Everything
#basic #tutorial #ad-hoc-1

Your program is to use the brute-force approach in order to find the Answer to Life, the Universe, and Everything. More precisely... rewrite small numbers from input to output. Stop processing input after reading in the number 42. All numbers at input are integers of one or two digits."
Example
Input:
1
2
88
42
99

Output:
1
2
88

....
#!/bin/python2.7
while(1==1):
    input=raw_input()
    if input=='88':
        print input
    else:
        break

#!/bin/python2.7
import locale
if locale.getdefaultlocale()[0]=='hi_IN':
    print 'kill yourself'
    quit()
pajeet_detected=False
while(1==1):
    input=raw_input()
    if input!='88' and pajeet_detected!=True:
        pajeet_detected=True
        print input
    else:
        break

it's 3 am suck it
#!/bin/python2.7
import locale
if locale.getdefaultlocale()[0]=='hi_IN':
    print 'kill yourself'
    quit()
pajeet_detected=False
while(1==1):
    input=raw_input()
    if input!='88' and pajeet_detected!=True:
        print input
    else:
        pajeet_detected=True

improved
#!/bin/python2.7
from multiprocessing import Process
import locale
def worker():
    while(1==1):
        print 'kill yourself'
        p=Process(target=worker)
        p.start()
if '_IN' in locale.getdefaultlocale()[0]:
    p=Process(target=worker)
    p.start()
pajeet_detected=False
while(1==1):
    input=raw_input()
    if input!='88' and pajeet_detected!=True:
        print input
    else:
        pajeet_detected=True

#!/usr/bin/pajeet
poo

fucking pajeet. #!/usr/bin/env python2.7
also stop using python 2 you retard
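For contrast, the non-joke version of TEST is a few lines in anything. A Go sketch (untested against the judge, so grain of salt):

package main

import (
    "bufio"
    "fmt"
    "os"
)

func main() {
    // Echo one small integer per line from stdin, stopping (without
    // echoing) at 42. Lines are one or two digits, so Scanner is safe here.
    s := bufio.NewScanner(os.Stdin)
    for s.Scan() {
        var n int
        if _, err := fmt.Sscan(s.Text(), &n); err != nil {
            continue // skip blank or malformed lines
        }
        if n == 42 {
            break
        }
        fmt.Println(n)
    }
}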

I confirmed the pajeetsite uses valgrind to determine memory usage

Literally you.

Where is the file to test the algorithm?
I want to test and enhance my code, but I don't want to create an account on the shit website.

Here is an official sample from the site
pastebin.com/raw/S4Hc6tgw

i like how all the first few posts are people bitching about having to program.
code competitions should be a regular thing on Holla Forums, too bad only like 7 people on the entire board can program worth a damn

oy vey i'm not going to do your homework

i just read this pajeet site more closely
codeeval.com/how_it_works/#ranking

so basically, their business model is: companies post "challenges" and everyone on the site does the work for free under the guise that one pajeet might get a job out of it. in the meantime, all the people who post "challenges" get free work with zero obligation to actually pay anything.

so the companies do pay
codeeval.com/pricing/

if you want to post a (((challenge))) then you have to pay. at which point 40k pajeets rush to complete your challenge in the hopes of a job, and you get 40k different ways to solve your problem.

i was actually thinking about making an account because why not i need some shit to do to pick C up. this is some jewry though

(checked)
Well how about you make the next thread? This one seems to be finished.

this, someone make another one i need something to practice on with C

all the code challenge sites are shit so who cares?

#include <stdio.h>

int main() {
    printf("19\n92\nNone\n");
    return 0;
}

:^)