Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine code-word finding capabilities #56

Open
2 of 3 tasks
jakewilliami opened this issue Nov 5, 2020 · 2 comments
Open
2 of 3 tasks

Refine code-word finding capabilities #56

jakewilliami opened this issue Nov 5, 2020 · 2 comments

Comments

@jakewilliami
Copy link
Owner

jakewilliami commented Nov 5, 2020

  • Do NOT load into memory all codewords. Completed in 9a58d23, by @dmipeck.1 Redone using Iterators.product in 993ec21.
  • Define a cleverer algorithm for random searching.2
  • Refine code for performance consideration.3

1 Edited and refined in 02d911c, 2f6e6c5, 3fff9d6, and 6546206.
2 The current plan is to have a kind of mutating code. The problem with the random "algorithm" (as it were) is to balance it with the first point in this issue; we do not want to load the universe into memory. See below.
3 After the bad performance with the memory efficient iterator method, we should work as much on performance (regarding time) as we can. This does not mean we need to change the algorithm, it just means we need to be careful. Edited and refined in cebcd76, a5b8d4a, 77ee159, 37f6886, 70dbc29, and 267f501. See below.

@jakewilliami
Copy link
Owner Author

jakewilliami commented Nov 5, 2020

Playing around with random search implementation

Attempt One

No better than the old one, as it ignores the first point in this issue.

Base.rand(𝒰::UniverseParameters) = rand(𝒰.Σ)

Base.rand(𝒰::UniverseParameters, C::AbstractArray) = rand.((C, 𝒰.Σ, 1:𝒰.n))

function replace_if_allowed!(C::AbstractArray, d::Integer, w, w′)
	for c in C
		if hamming_distance(c, w′) < d
			return false
		end
	end
	
	replace!(C, w => w′)
	return true
end

replace_if_allowed(C::AbstractArray, d::Integer, w, w′) =
	replace_if_allowed!(copy(C), d, w, w′)

function mutate_codeword(w::Union{Tuple{T}, NTuple{N, T}}, n::Integer, i::Integer, a::T) where {N, T}
	w̲ = collect(w)
	w̲[i] = a
	
	return ntuple(j -> w̲[j], n)
end

C = get_codewords_greedy(𝒰, d)
	
for _ in 1:length(C)
    # set/reset counter
    j = 0
    # try to mutate the code so that the new word fits (try up to m times)
    for _ in 1:m
        # choose a random word, letter, and position in the word
        w, a, i = rand(𝒰, C)
        # mutate a copy of the word
        w′ = mutate_codeword(w, 𝒰.n, i, a)
        # try to alter the code
        replace_if_allowed!(C, d, w, w′) && break
        # increment counter
        j += 1
        # if no viable option is found for a mutation, remove the word from the code.
        isequal(j, m) && filter!(e -> e  w, C)
    end
end
	
return C

Attempt Two

Garden of forking paths.

# choose random starting point
init_arr = rand(𝒰.Σ, 𝒰.n)
init_word = ntuple(j -> init_arr[j], length(init_arr))
# initialise array with random word
C = [init_word]
# iterate through words
for w in 𝒰
    C′ = copy(C)
    # find a word that will fit
    for w′ in 𝒰
        push_if_allowed!(C′, w′, d)
    end
    next_w = rand(𝒰.Σ, 𝒰.n)
    C′ = nothing
	
    push!(C, next_w)
end
	
return C

Attempt Three

As we were previously, but the only random is the start.

# choose random starting point
init_arr = rand(𝒰.Σ, 𝒰.n)
init_word = ntuple(j -> init_arr[j], length(init_arr))
# initialise array with random word
C = [init_word]
# iterate through words
for w in 𝒰
    push_if_allowed!(C′, w′, d)
end
	
return C

Attempt Four

I found this solution entirely by accident, while I was following up an outdated lead for what I thought might have been a solution (Random.randperm). Thank God the people over at JuliaCollections put the hard work into this one... A strange problem it was indeed. The only limitation here I can think of is another dependency (namely, IterTools).

C = Tuple[]
for _ in 1:length(𝒰)
    # https://github.com/JuliaCollections/IterTools.jl/blob/master/src/IterTools.jl#L610-L689
    push_if_allowed!(C, nth(𝒰, rand(1:length(𝒰))), d)
end
	
return C

Note

I suspect I know which one we will use (the cleanest one that is using pre-built functions as part of JuliaCollections, I suspect). However, I would like to request @dmipeck to review the current options, as he was the one who suggested mutating.

Using any of these methods is still considerably slower than the memory inefficient version. For example, consider the benchmarks of (respectively) the original and the fourth implementation of the random algorithms:

julia> @btime get_codewords_random(["a", "b", "c"], 4, 2); # original
  86.998 μs (1988 allocations: 97.67 KiB)

julia> @btime get_codewords_random(["a", "b", "c"], 4, 3); # "memory efficient"
  12.237 ms (89790 allocations: 2.45 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 6, 3); # original
  4.225 ms (130948 allocations: 8.01 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 6, 3); # "memory efficient"
  1.188 s (8633499 allocations: 272.82 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 8, 3); # original
  377.728 ms (10969724 allocations: 836.81 MiB)

julia> @btime get_codewords_random(["a", "b", "c"], 8, 3); # "memory efficient"
  100.025 s (738611869 allocations: 25.82 GiB)

@jakewilliami
Copy link
Owner Author

$ ./test/benchmark.jl # original
  3.527 s (54961978 allocations: 3.89 GiB)

$ ./test/benchmark.jl # after changing abstract_types.jl
  3.454 s (53511456 allocations: 3.85 GiB)

$ ./test/benchmark.jl # after changing algebra.jl
  3.342 s (53767559 allocations: 3.85 GiB)

$ ./test/benchmark.jl # after changing distance.jl
  2.580 s (31648208 allocations: 2.64 GiB)

$ # only minor changes to levenshtein.jl

$ ./test/benchmark.jl # after changing messages.jl, and in the process adding a Word struct and refining abstract_types.jl for said type
  1.124 s (10795017 allocations: 485.51 MiB)

$ ./test/benchmark.jl # after changes to rref.jl and utils.jl
  1.129 s (10767247 allocations: 483.70 MiB)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants
@jakewilliami and others