Refine code-word finding capabilities #56

jakewilliami · 2020-11-05T11:21:03Z

~~Do NOT load into memory all codewords.~~ Completed in 9a58d23, by @dmipeck.¹ Redone using Iterators.product in 993ec21.
Define a cleverer algorithm for random searching.²
Refine code for performance consideration.³

¹ Edited and refined in 02d911c, 2f6e6c5, 3fff9d6, and 6546206.
² The current plan is to have a kind of mutating code. The problem with the random "algorithm" (as it were) is to balance it with the first point in this issue; we do not want to load the universe into memory. See below.
³ After the bad performance with the memory efficient iterator method, we should work as much on performance (regarding time) as we can. This does not mean we need to change the algorithm, it just means we need to be careful. Edited and refined in cebcd76, a5b8d4a, 77ee159, 37f6886, 70dbc29, and 267f501. See below.

The text was updated successfully, but these errors were encountered:

jakewilliami · 2020-11-05T12:23:01Z

Playing around with random search implementation

Attempt One

No better than the old one, as it ignores the first point in this issue.

Base.rand(𝒰::UniverseParameters) = rand(𝒰.Σ)

Base.rand(𝒰::UniverseParameters, C::AbstractArray) = rand.((C, 𝒰.Σ, 1:𝒰.n))

function replace_if_allowed!(C::AbstractArray, d::Integer, w, w′)
	for c in C
		if hamming_distance(c, w′) < d
			return false
		end
	end
	
	replace!(C, w => w′)
	return true
end

replace_if_allowed(C::AbstractArray, d::Integer, w, w′) =
	replace_if_allowed!(copy(C), d, w, w′)

function mutate_codeword(w::Union{Tuple{T}, NTuple{N, T}}, n::Integer, i::Integer, a::T) where {N, T}
	w̲ = collect(w)
	w̲[i] = a
	
	return ntuple(j -> w̲[j], n)
end

C = get_codewords_greedy(𝒰, d)
	
for _ in 1:length(C)
    # set/reset counter
    j = 0
    # try to mutate the code so that the new word fits (try up to m times)
    for _ in 1:m
        # choose a random word, letter, and position in the word
        w, a, i = rand(𝒰, C)
        # mutate a copy of the word
        w′ = mutate_codeword(w, 𝒰.n, i, a)
        # try to alter the code
        replace_if_allowed!(C, d, w, w′) && break
        # increment counter
        j += 1
        # if no viable option is found for a mutation, remove the word from the code.
        isequal(j, m) && filter!(e -> e ≠ w, C)
    end
end
	
return C

Attempt Two

Garden of forking paths.

# choose random starting point
init_arr = rand(𝒰.Σ, 𝒰.n)
init_word = ntuple(j -> init_arr[j], length(init_arr))
# initialise array with random word
C = [init_word]
# iterate through words
for w in 𝒰
    C′ = copy(C)
    # find a word that will fit
    for w′ in 𝒰
        push_if_allowed!(C′, w′, d)
    end
    next_w = rand(𝒰.Σ, 𝒰.n)
    C′ = nothing
	
    push!(C, next_w)
end
	
return C

Attempt Three

As we were previously, but the only random is the start.

# choose random starting point
init_arr = rand(𝒰.Σ, 𝒰.n)
init_word = ntuple(j -> init_arr[j], length(init_arr))
# initialise array with random word
C = [init_word]
# iterate through words
for w in 𝒰
    push_if_allowed!(C′, w′, d)
end
	
return C

Attempt Four

I found this solution entirely by accident, while I was following up an outdated lead for what I thought might have been a solution (Random.randperm). Thank God the people over at JuliaCollections put the hard work into this one... A strange problem it was indeed. The only limitation here I can think of is another dependency (namely, IterTools).

C = Tuple[]
for _ in 1:length(𝒰)
    # https://github.com/JuliaCollections/IterTools.jl/blob/master/src/IterTools.jl#L610-L689
    push_if_allowed!(C, nth(𝒰, rand(1:length(𝒰))), d)
end
	
return C

Note

I suspect I know which one we will use (the cleanest one that is using pre-built functions as part of JuliaCollections, I suspect). However, I would like to request @dmipeck to review the current options, as he was the one who suggested mutating.

Using any of these methods is still considerably slower than the memory inefficient version. For example, consider the benchmarks of (respectively) the original and the fourth implementation of the random algorithms:

julia> @btime get_codewords_random(["a", "b", "c"], 4, 2); # original
  86.998 μs (1988 allocations: 97.67 KiB)

julia> @btime get_codewords_random(["a", "b", "c"], 4, 3); # "memory efficient"
  12.237 ms (89790 allocations: 2.45 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 6, 3); # original
  4.225 ms (130948 allocations: 8.01 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 6, 3); # "memory efficient"
  1.188 s (8633499 allocations: 272.82 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 8, 3); # original
  377.728 ms (10969724 allocations: 836.81 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 8, 3); # "memory efficient"
  100.025 s (738611869 allocations: 25.82 GiB)

jakewilliami · 2021-01-02T06:23:50Z

$ ./test/benchmark.jl # original
  3.527 s (54961978 allocations: 3.89 GiB)

$ ./test/benchmark.jl # after changing abstract_types.jl
  3.454 s (53511456 allocations: 3.85 GiB)

$ ./test/benchmark.jl # after changing algebra.jl
  3.342 s (53767559 allocations: 3.85 GiB)

$ ./test/benchmark.jl # after changing distance.jl
  2.580 s (31648208 allocations: 2.64 GiB)

$ # only minor changes to levenshtein.jl

$ ./test/benchmark.jl # after changing messages.jl, and in the process adding a Word struct and refining abstract_types.jl for said type
  1.124 s (10795017 allocations: 485.51 MiB)

$ ./test/benchmark.jl # after changes to rref.jl and utils.jl
  1.129 s (10767247 allocations: 483.70 MiB)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refine code-word finding capabilities #56

Refine code-word finding capabilities #56

jakewilliami commented Nov 5, 2020 •

edited

Loading

jakewilliami commented Nov 5, 2020 •

edited

Loading

jakewilliami commented Jan 2, 2021

Refine code-word finding capabilities #56

Refine code-word finding capabilities #56

Comments

jakewilliami commented Nov 5, 2020 • edited Loading

jakewilliami commented Nov 5, 2020 • edited Loading

Playing around with random search implementation

Attempt One

Attempt Two

Attempt Three

Attempt Four

Note

jakewilliami commented Jan 2, 2021

jakewilliami commented Nov 5, 2020 •

edited

Loading

jakewilliami commented Nov 5, 2020 •

edited

Loading